BACKGROUND
Zinc finger proteins (ZFPs) are proteins that can bind to DNA in a
sequence-specific manner. Zinc fingers were first identified in the transcription factor
TFIIIA from the oocytes of the African clawed toad, Xenopus laevis. An exemplary
motif characterizing one class of these proteins (C2H2 class) is
-Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His (where X is any amino acid) (SEQ. ID. No:1).
A single finger domain is about
30 amino acids in length, and several structural studies have demonstrated that it contains
an alpha helix containing the two invariant histidine residues and two invariant cysteine
residues in a beta turn co-ordinated through zinc. To date, over 10,000 zinc finger
sequences have been identified in several thousand known or putative transcription
factors. Zinc finger domains are involved not only in DNA-recognition, but also in RNA
binding and in protein-protein binding. Current estimates are that this class of molecules
will constitute about 2% of all human genes.

The x-ray crystal structure of Zif268, a three-finger domain from a murine
transcription factor, has been solved in complex with a cognate DNA-sequence and
shows that each finger can be superimposed on the next by a periodic rotation. The
structure suggests that each finger interacts independently with DNA over 3 base-pair
intervals, with side-chains at positions -1, 2, 3 and 6 on each recognition helix making
contacts with their respective DNA triplet subsites. The amino terminus of Zif268 is
situated at the 3' end of the DNA strand with which it makes most contacts. Some zinc
fingers can bind to a fourth base in a target segment. If the strand with which a zinc
finger protein makes most contacts is designated the target strand, some zinc finger
proteins bind to a three base triplet in the target strand and a fourth base on the nontarget
strand. The fourth base is complementary to the base immediately 3' of the three base
subsite.

The structure of the Zif268-DNA complex also suggested that the DNA
sequence specificity of a zinc finger protein might be altered by making amino acid
substitutions at the four helix positions (-1, 2, 3 and 6) on each of the zinc finger
recognition helices. Phage display experiments using zinc finger combinatorial libraries
to test this observation were published in a series of papers in 1994 (Rebar et al., Science
263, 671-673 (1994); Jamieson et al., Biochemistry 33, 5689-5695 (1994); Choo et al.,
PNAS 91, 11163-11167 (1994)). Combinatorial libraries were constructed with
randomized side-chains in either the first or middle finger of Zif268 and then used to
select for an altered Zif268 binding site in which the appropriate DNA sub-site was
replaced by an altered DNA triplet. Further, correlation between the nature of introduced
mutations and the resulting alteration in binding specificity gave rise to a partial set of
substitution rules for design of ZFPs with altered binding specificity.

Greisman & Pabo, Science 275, 657-661 (1997) discuss an elaboration of 
the phage display method in which each finger of a Zif268 was successively randomized 
and selected for binding to a new triplet sequence. This paper reported selection of ZFPs 
for a nuclear hormone response element, a p53 target site and a TATA box sequence. 

A number of papers have reported attempts to produce ZFPs to modulate
particular target sites. For example, Choo et al., Nature 372, 645 (1994), report an
attempt to design a ZFP that would repress expression of a bcr-abl oncogene. The target
segment to which the ZFPs would bind was a nine base sequence 5'GCA GAA GCC3'
chosen to overlap the junction created by a specific oncogenic translocation fusing the
genes encoding bcr and abl. The intention was that a ZFP specific to this target site
would bind to the oncogene without binding to abl or bcr component genes. The authors
used phage display to screen a mini-library of variant ZFPs for binding to this target
segment. A variant ZFP thus isolated was then reported to repress expression of a stably
transfected bcr-abl construct in a cell line.

Pomerantz et al., Science 267, 93-96 (1995) reported an attempt to design
a novel DNA binding protein by fusing two fingers from Zif268 with a homeodomain
from Oct-1. The hybrid protein was then fused with a transcriptional activator for
expression as a chimeric protein. The chimeric protein was reported to bind a target site
representing a hybrid of the subsites of its two components. The authors then constructed
a reporter vector containing a luciferase gene operably linked to a promoter and a hybrid
site for the chimeric DNA binding protein in proximity to the promoter. The authors
reported that their chimeric DNA binding protein could activate expression of the
luciferase gene.

Liu et al., PNAS 94, 5525-5530 (1997) report forming a composite zinc
finger protein by using a peptide spacer to link two component zinc finger proteins each
having three fingers. The composite protein was then further linked to a transcriptional
activation domain. It was reported that the resulting chimeric protein bound to a target
site formed from the target segments bound by the two component zinc finger proteins. It
was further reported that the chimeric zinc finger protein could activate transcription of a
reporter gene when its target site was inserted into a reporter plasmid in proximity to a
promoter operably linked to the reporter.

Choo et al., WO 98/53058, WO 98/53059, and WO 98/53060 (1998)
discuss selection of zinc finger proteins to bind to a target site within the HIV Tat gene.
Choo et al. also discuss selection of a zinc finger protein to bind to a target site
encompassing a site of a common mutation in the oncogene ras. The target site within ras
was thus constrained by the position of the mutation.

The present application is related to commonly owned copending applications
09/229,007 filed January 12, 1999 and 09/229,037 filed January 12, 1999.

SUMMARY OF THE CLAIMED INVENTION 
Tables 1-5 show the amino acid sequences of a large collection of zinc
finger proteins and corresponding target sites bound by the proteins. Nucleotide
sequences of target sites are shown in Col. 2. Target sites typically have 9 or 10 bases
and constitute three target subsites bound by respective zinc finger components of a
multifinger protein. Amino acid sequences of zinc finger components are shown in Cols.
4, 6 and 8. The amino acids shown occupy positions -1 to +6 of a zinc finger. Table 6
shows consensus sequences for zinc fingers and target subsites bound by the fingers. Col.
1 shows the nucleotides occupying a target subsite. Cols. 2-4 show amino acids
occupying positions -1 to +6 of zinc fingers binding to a target subsite.

Accordingly, the invention provides zinc fingers having amino acid
sequences and target subsite binding specificities shown in Table 6. As an example, a zinc
finger having the amino acid sequence DXSNXXR at positions -1 to +6 has a target
subsite GAC. As another example, a zinc finger having the amino acid sequence
RX(D/S)NXXR at positions -1 to +6 has a target subsite of GAG. A zinc finger having
an amino acid sequence TXGNXXR at positions -1 to +6 has the target subsite GAT. A
zinc finger having the sequence (Q/T)XSNXXR at positions -1 to +6 binds to a target
subsite GAT. A zinc finger having an amino acid sequence QXG(S/D)XXR at positions
-1 to +6 binds to a target subsite GCA. A zinc finger having an amino acid sequence
RXDEXXR binds to a target subsite GCG. A zinc finger having an amino acid sequence
QXSDXXR at positions -1 to +6 binds to a target subsite GCT. A zinc finger having an
amino acid sequence QX(G/A)HXXR at positions -1 to +6 binds to a target subsite GGA.
A zinc finger having an amino acid sequence DXSHXXR binds to a target subsite GGC.
A zinc finger having an amino acid sequence RXDHXXR at positions -1 to +6 binds to a
target subsite GGG. A zinc finger having an amino acid sequence RXDAXXR at
positions -1 to +6 binds to a target subsite GTG.

The invention further provides nucleic acid encoding zinc fingers, 
including all of the zinc fingers described above. 

The invention further provides segments of a zinc finger 
comprising a sequence of seven contiguous amino acids as shown in any of Tables 1-5. 
The invention also provides nucleic acids encoding any of these segments and zinc 
fingers comprising the same. 

The invention further provides zinc finger proteins comprising 
first, second and third zinc fingers. The first, second and third zinc fingers comprise 
respectively first, second and third segments of seven contiguous amino acids as shown in 
a row of Tables 1-5. The invention further provides nucleic acids encoding such zinc 
finger proteins. 

BRIEF DESCRIPTION OF THE FIGURE
Fig. 1 shows assembly of nucleic acids encoding zinc finger binding
proteins.

DEFINITIONS 

A zinc finger DNA binding protein is a protein or segment within a larger
protein that binds DNA in a sequence-specific manner as a result of stabilization of
protein structure through coordination of a zinc ion. The term zinc finger DNA binding
protein is often abbreviated as zinc finger protein or ZFP.

A designed zinc finger protein is a protein not occurring in nature whose
design/composition results principally from rational criteria. Rational criteria for design
include application of substitution rules and computerized algorithms for processing
information in a database storing information of existing ZFP designs and binding data.

A selected zinc finger protein is a protein not found in nature whose
production results primarily from an empirical process such as phage display.

The term naturally-occurring is used to describe an object that can be
found in nature as distinct from being artificially produced by man. For example, a
polypeptide or polynucleotide sequence that is present in an organism (including viruses)
that can be isolated from a source in nature and which has not been intentionally modified
by man in the laboratory is naturally-occurring. Generally, the term naturally-occurring
refers to an object as present in a non-pathological (undiseased) individual, such as would
be typical for the species.

A nucleic acid is operably linked when it is placed into a functional
relationship with another nucleic acid sequence. For instance, a promoter or enhancer is
operably linked to a coding sequence if it increases the transcription of the coding
sequence. Operably linked means that the DNA sequences being linked are typically
contiguous and, where necessary to join two protein coding regions, contiguous and in
reading frame. However, since enhancers generally function when separated from the
promoter by up to several kilobases or more and intronic sequences may be of variable
lengths, some polynucleotide elements may be operably linked but not contiguous.

A specific binding affinity between, for example, a ZFP and a specific
target site means a binding affinity of at least 1 x 10^6 M^-1.
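By way of illustration only, the threshold above is an association constant, whereas binding data in the tables are reported as a dissociation constant (Kd, in nM). The following minimal sketch converts between the two, assuming simple 1:1 binding so that Kd = 1/Ka. The function names are hypothetical and do not form part of the disclosure.

```python
def ka_to_kd_nM(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant in nM.

    Assumes simple 1:1 binding, so Kd = 1 / Ka (in M), then scales to nM.
    """
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1e9 / ka_per_molar


def meets_specific_affinity(kd_nM: float, min_ka_per_molar: float = 1e6) -> bool:
    """Return True if a Kd (nM) corresponds to at least the minimum Ka."""
    return kd_nM <= ka_to_kd_nM(min_ka_per_molar)
```

Under this assumption, an affinity of 1 x 10^6 M^-1 corresponds to a Kd of 1000 nM, so any protein with a Kd at or below that value satisfies the definition.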

The terms "modulating expression", "inhibiting expression" and "activating
expression" of a gene refer to the ability of a zinc finger protein to activate or inhibit
transcription of a gene. Activation includes prevention of subsequent transcriptional
inhibition (i.e., prevention of repression of gene expression) and inhibition includes
prevention of subsequent transcriptional activation (i.e., prevention of gene activation).
Modulation can be assayed by determining any parameter that is indirectly or directly
affected by the expression of the target gene. Such parameters include, e.g., changes in
RNA or protein levels, changes in protein activity, changes in product levels, changes in
downstream gene expression, changes in reporter gene transcription (luciferase, CAT,
beta-galactosidase, GFP (see, e.g., Misteli & Spector, Nature Biotechnology 15:961-964
(1997))); changes in signal transduction, phosphorylation and dephosphorylation,
receptor-ligand interactions, second messenger concentrations (e.g., cGMP, cAMP, IP3, and
Ca2+), cell growth, and neovascularization, whether in vitro, in vivo, or ex vivo. Such
functional effects can be measured by any means known to those skilled in the art, e.g.,
measurement of RNA or protein levels, measurement of RNA stability, identification of
downstream or reporter gene expression, e.g., via chemiluminescence, fluorescence,
colorimetric reactions, antibody binding, inducible markers, ligand binding assays;
changes in intracellular second messengers such as cGMP and inositol triphosphate (IP3);
changes in intracellular calcium levels; cytokine release, and the like.
A "regulatory domain" refers to a protein or a protein subsequence that has
transcriptional modulation activity. Typically, a regulatory domain is covalently or
non-covalently linked to a ZFP to modulate transcription. Alternatively, a ZFP can act alone,
without a regulatory domain, or with multiple regulatory domains to modulate
transcription.

A D-able subsite within a target site has the motif 5'NNGK3'. A target
site containing one or more such motifs is sometimes described as a D-able target site. A
zinc finger appropriately designed to bind to a D-able subsite is sometimes referred to as
a D-able finger. Likewise a zinc finger protein containing at least one finger designed or
selected to bind to a target site including at least one D-able subsite is sometimes referred
to as a D-able zinc finger protein.
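As an illustrative sketch only, occurrences of the 5'NNGK3' motif in a candidate target sequence can be located with a simple pattern search. The sketch assumes the standard IUPAC reading of the motif (N is any base, K is G or T). The function name is hypothetical, and the search reports every occurrence of the motif without regard to triplet register.

```python
import re

# 5'NNGK3': N = any base, K = G or T (IUPAC ambiguity codes).
# The pattern sits inside a lookahead so overlapping occurrences are all found.
D_ABLE_MOTIF = re.compile(r"(?=([ACGT]{2}G[GT]))")


def find_d_able_subsites(target_5to3: str) -> list:
    """Return (offset, 4-base match) pairs for each 5'NNGK3' occurrence.

    The input is read 5' to 3'; spaces between triplets are ignored.
    """
    seq = target_5to3.replace(" ", "").upper()
    return [(m.start(), m.group(1)) for m in D_ABLE_MOTIF.finditer(seq)]
```

For example, the Zif268 target 5'GCG TGG GCG3' contains the motif at offsets 0, 2 and 3 (GCGT, GTGG and TGGG), whereas 5'GCA GAA GCC3' contains none.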

DETAILED DESCRIPTION 

I. General 

Tables 1-5 list a collection of nonnaturally occurring zinc finger protein
sequences and their corresponding target sites. The first column of each table is an
internal reference number. The second column lists a 9 or 10 base target site bound by a
three-finger zinc finger protein, with the target sites listed in 5' to 3' orientation. The
third column provides SEQ ID NOs for the target site sequences listed in column 2. The
fourth, sixth and eighth columns list amino acid residues from the first, second and third
fingers, respectively, of a zinc finger protein which recognizes the target sequence listed
in the second column. For each finger, seven amino acids, occupying positions -1 to +6
of the finger, are listed. The numbering convention for zinc fingers is defined below.
Columns 5, 7 and 9 provide SEQ ID NOs for the amino acid sequences listed in columns
4, 6 and 8, respectively. The final column of each table lists the binding affinity (i.e., the
Kd in nM) of the zinc finger protein for its target site. Binding affinities are measured as
described below.

Each finger binds to a triplet of bases within a corresponding target
sequence. The first finger binds to the first triplet starting from the 3' end of a target site,
the second finger binds to the second triplet, and the third finger binds the third (i.e., the
5'-most) triplet of the target sequence. For example, the RSDSLTS finger (SEQ ID
NO: 646) of SBS# 201 (Table 2) binds to 5'TTG3', the ERSTLTR finger (SEQ ID
NO: 851) binds to 5'GCC3' and the QRADLRR finger (SEQ ID NO: 1056) binds to
5'GCA3'.
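The pairing of fingers with triplets described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name is hypothetical, the nine-base site used in the example is the one implied by the three triplets recited for SBS# 201, and ten-base sites are deliberately not handled because the position of the additional base is not stated here.

```python
def assign_fingers_to_triplets(target_5to3: str, fingers: list) -> list:
    """Pair each finger with its triplet subsite.

    target_5to3 is the target site written 5' to 3'; fingers are given in
    order (first, second, third ...). The first finger takes the 3'-most
    triplet and the last finger takes the 5'-most triplet.
    """
    site = target_5to3.replace(" ", "").upper()
    if len(site) != 3 * len(fingers):
        raise ValueError("sketch expects exactly three bases per finger")
    # Triplets as they appear when reading the site 5' -> 3'.
    triplets = [site[i:i + 3] for i in range(0, len(site), 3)]
    # Reverse so that the 3'-most triplet comes first, matching finger order.
    return list(zip(fingers, reversed(triplets)))
```

Applied to 5'GCA GCC TTG3' with the fingers RSDSLTS, ERSTLTR and QRADLRR, the sketch returns the pairs (RSDSLTS, TTG), (ERSTLTR, GCC) and (QRADLRR, GCA), in agreement with the example above.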

Table 6 lists a collection of consensus sequences for zinc fingers and the
target sites bound by such sequences. Conventional one letter amino acid codes are used
to designate amino acids occupying consensus positions. The symbol "X" designates a
nonconsensus position that can in principle be occupied by any amino acid. In most zinc
fingers of the C2H2 type, binding specificity is principally conferred by residues -1, +2,
+3 and +6. Accordingly, consensus sequences determining binding specificity typically
include at least these residues. Consensus sequences are useful for designing zinc fingers
to bind to a given target sequence. Residues occupying other positions can be selected
based on sequences in Tables 1-5, or other known zinc finger sequences. Alternatively,
these positions can be randomized with a plurality of candidate amino acids and screened
against one or more target sequences to refine or improve binding
specificity. In general, the same consensus sequence can be used for design of a zinc
finger regardless of the relative position of that finger in a multi-finger zinc finger
protein. For example, the sequence RXDNXXR can be used to design an N-terminal,
central or C-terminal finger of a three-finger protein. However, some consensus sequences
are most suitable for designing a zinc finger to occupy a particular position in a
multi-finger protein. For example, the consensus sequence RXDHXXQ is most suitable for
designing a C-terminal finger of a three-finger protein.
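As a hedged illustration of how such consensus sequences might be applied computationally, the sketch below converts the notation used herein ("X" for any amino acid, "(D/S)" for alternatives) into a regular expression and tests a seven-residue sequence (positions -1 to +6) against it. The function names are hypothetical. The small lookup table contains only a few of the consensus sequences recited in the Summary and is not a substitute for Table 6.

```python
import re


def consensus_to_regex(consensus: str):
    """Translate e.g. 'RX(D/S)NXXR' into a compiled regular expression."""
    parts = []
    i = 0
    while i < len(consensus):
        ch = consensus[i]
        if ch == "(":                      # alternatives such as (D/S)
            j = consensus.index(")", i)
            parts.append("[" + consensus[i + 1:j].replace("/", "") + "]")
            i = j + 1
        elif ch == "X":                    # nonconsensus position
            parts.append(".")
            i += 1
        else:                              # fixed consensus residue
            parts.append(re.escape(ch))
            i += 1
    return re.compile("".join(parts))


def matches_consensus(consensus: str, residues: str) -> bool:
    """True if the residues at positions -1 to +6 fit the consensus."""
    return consensus_to_regex(consensus).fullmatch(residues.upper()) is not None


# A few consensus/subsite pairs taken from the Summary (illustrative subset).
EXAMPLE_CONSENSUS = {
    "DXSNXXR": "GAC",
    "RX(D/S)NXXR": "GAG",
    "RXDEXXR": "GCG",
    "RXDHXXR": "GGG",
    "RXDAXXR": "GTG",
}


def candidate_subsites(residues: str) -> list:
    """List subsites whose consensus (from the subset above) fits the residues."""
    return [site for cons, site in EXAMPLE_CONSENSUS.items()
            if matches_consensus(cons, residues)]
```

For instance, the sequence RSDELTR fits the consensus RXDEXXR and is therefore associated with the subsite GCG under this subset.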



II. Characteristics of Zinc Finger Proteins

Zinc finger proteins are formed from zinc finger components. For
example, zinc finger proteins can have one to thirty-seven fingers, commonly having 2, 3,
4, 5 or 6 fingers. A zinc finger protein recognizes and binds to a target site (sometimes
referred to as a target segment) that represents a relatively small subsequence within a
target gene. Each component finger of a zinc finger protein can bind to a subsite within
the target site. The subsite includes a triplet of three contiguous bases all on the same
strand (sometimes referred to as the target strand). The subsite may or may not also
include a fourth base on the opposite strand that is the complement of the base
immediately 3' of the three contiguous bases on the target strand. In many zinc finger
proteins, a zinc finger binds to its triplet subsite substantially independently of other
fingers in the same zinc finger protein. Accordingly, the binding specificity of a zinc
finger protein containing multiple fingers is usually approximately the aggregate of the
specificities of its component fingers. For example, if a zinc finger protein is formed
from first, second and third fingers that individually bind to triplets XXX, YYY, and
ZZZ, the binding specificity of the zinc finger protein is 3'XXX YYY ZZZ5'.

The relative order of fingers in a zinc finger protein from N-terminal to
C-terminal determines the relative order of triplets in the 3' to 5' direction in the target. For
example, if a zinc finger protein comprises from N-terminal to C-terminal first, second
and third fingers that individually bind, respectively, to triplets 5'GAC3', 5'GTA3' and
5'GGC3' then the zinc finger protein binds to the target segment 3'CAGATGCGG5'. If
the zinc finger protein comprises the fingers in another order, for example, second finger,
first finger, third finger, then the zinc finger protein binds to a target segment comprising
a different permutation of triplets, in this example, 3'ATGCAGCGG5' (see Berg & Shi,
Science 271, 1081-1085 (1996)). The assessment of binding properties of a zinc finger
protein as the aggregate of its component fingers may, in some cases, be influenced by
context-dependent interactions of multiple fingers binding in the same protein.
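The relationship between finger order and target segment described above can be sketched as follows. This is a minimal illustration that assumes fingers bind independently, with each triplet supplied in 5' to 3' orientation; the function names are hypothetical.

```python
def target_segment_3to5(triplets_n_to_c: list) -> str:
    """Target segment written 3' -> 5' for fingers listed N- to C-terminal.

    Each input triplet is written 5' -> 3', so it is reversed before being
    concatenated in finger order.
    """
    return "".join(t.upper()[::-1] for t in triplets_n_to_c)


def target_segment_5to3(triplets_n_to_c: list) -> str:
    """The same target segment written in the conventional 5' -> 3' direction."""
    return "".join(t.upper() for t in reversed(triplets_n_to_c))
```

With fingers binding 5'GAC3', 5'GTA3' and 5'GGC3' in that order, the first function yields CAGATGCGG (read 3' to 5'). Placing the second finger first yields ATGCAGCGG, as in the example above.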

Two or more zinc finger proteins can be linked to have a target specificity
that is the aggregate of that of the component zinc finger proteins (see e.g., Kim & Pabo,
PNAS 95, 2812-2817 (1998)). For example, a first zinc finger protein having first, second
and third component fingers that respectively bind to XXX, YYY and ZZZ can be linked
to a second zinc finger protein having first, second and third component fingers with
binding specificities, AAA, BBB and CCC. The binding specificity of the combined first
and second proteins is thus 3'XXXYYYZZZ___AAABBBCCC5', where the underline
indicates a short intervening region (typically 0-5 bases of any type). In this situation, the
target site can be viewed as comprising two target segments separated by an intervening
segment.

Linkage can be accomplished using any of the following peptide linkers:
TGEKP (SEQ. ID. No:2) (Liu et al., 1997, supra); (G4S)n (SEQ. ID. No:3) (Kim et
al., PNAS 93, 1156-1160 (1996)); GGRRGGGS (SEQ. ID. No:4); LRQRDGERP (SEQ.
ID. No:5); LRQKDGGGSERP (SEQ. ID. No:6); LRQKD(G3S)2ERP (SEQ. ID. No:7).
Alternatively, flexible linkers can be rationally designed using computer programs
capable of modeling both DNA-binding sites and the peptides themselves or by phage
display methods. In a further variation, noncovalent linkage can be achieved by fusing
two zinc finger proteins with domains promoting heterodimer formation of the two zinc
finger proteins. For example, one zinc finger protein can be fused with fos and the other
with jun (see Barbas et al., WO 95/19431).

Linkage of two zinc finger proteins is advantageous for conferring a
unique binding specificity within a mammalian genome. A typical mammalian diploid
genome consists of 3 x 10^9 bp. Assuming that the four nucleotides A, C, G, and T are
randomly distributed, a given 9 bp sequence is present ~23,000 times. Thus a ZFP
recognizing a 9 bp target with absolute specificity would have the potential to bind to
~23,000 sites within the genome. An 18 bp sequence is present once in 3.4 x 10^10 bp, or
about once in a random DNA sequence whose complexity is ten times that of a
mammalian genome.
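The arithmetic behind these estimates can be reproduced with the short sketch below. It assumes, as stated above, equal and independent base frequencies. It further assumes that a site may occur on either strand, which is an inference made here because it is the assumption under which the quoted figures are recovered. The function names are hypothetical.

```python
def expected_occurrences(site_length: int,
                         genome_bp: float = 3e9,
                         strands: int = 2) -> float:
    """Expected number of occurrences of one specific site in random DNA.

    Each base matches with probability 1/4, so a site of length n matches
    at a given position with probability 4**-n; both strands are counted.
    """
    return strands * genome_bp / 4 ** site_length


def bp_per_occurrence(site_length: int, strands: int = 2) -> float:
    """Length of random double-stranded DNA expected to contain the site once."""
    return 4 ** site_length / strands
```

Under these assumptions a 9 bp site is expected about 22,900 times in 3 x 10^9 bp, and an 18 bp site is expected once per roughly 3.4 x 10^10 bp.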

A component finger of a zinc finger protein typically contains about 30
amino acids and has the following motif (N-C):

(SEQ. ID. No: 8)

Cys-(X)2-4-Cys-X.X.X.X.X.X.X.X.X.X.X.X-His-(X)3-5-His
                        -1 1 2 3 4 5 6 7

The two invariant histidine residues and two invariant cysteine residues in
a single beta turn are co-ordinated through zinc (see, e.g., Berg & Shi, Science 271,
1081-1085 (1996)). The above motif shows a numbering convention that is standard in the
field for the region of a zinc finger conferring binding specificity. The amino acid on the
left (N-terminal side) of the first invariant His residue is assigned the number +6, and
other amino acids further to the left are assigned successively decreasing numbers. The
alpha helix begins at residue 1 and extends to the residue following the second conserved
histidine. The entire helix is therefore of variable length, between 11 and 13 residues.
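For illustration, the motif and numbering convention above can be expressed as a regular expression that captures the seven residues at positions -1 to +6, i.e., the seven residues immediately N-terminal to the first invariant histidine. This is a minimal sketch with a hypothetical function name. It matches only the spacing given in the motif and makes no attempt to validate that a sequence folds as a zinc finger.

```python
import re
from typing import Optional

# Cys-(X)2-4-Cys-(X)12-His-(X)3-5-His, with the last seven of the twelve
# intervening residues (positions -1, 1, 2, 3, 4, 5, 6) captured as group 1.
C2H2_MOTIF = re.compile(r"C.{2,4}C.{5}(.{7})H.{3,5}H")


def recognition_residues(finger: str) -> Optional[str]:
    """Return residues -1 to +6 of a C2H2 finger, or None if no motif is found.

    In the returned string, index 0 is position -1, index 3 is position +3
    and index 6 is position +6.
    """
    match = C2H2_MOTIF.search(finger.upper())
    return match.group(1) if match else None
```

Applied to the three Zif268 fingers whose sequences are given herein, the sketch returns RSDELTR, RSDHLTT and RSDERKR, respectively.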

The process of designing or selecting a nonnaturally occurring or variant
ZFP typically starts with a natural ZFP as a source of framework residues. The process of
design or selection serves to define nonconserved positions (i.e., positions -1 to +6) so as
to confer a desired binding specificity. One suitable ZFP is the DNA binding domain of
the mouse transcription factor Zif268. The DNA binding domain of this protein has the
amino acid sequence:

YACPVESCDRRFSRSDELTRHIRTHTGQKP (F1) (SEQ. ID. No: 9)
FQCRICMRNFSRSDHLTTHIRTHTGEKP (F2) (SEQ. ID. No: 10)
FACDICGRKFARSDERKRHTKIHLRQK (F3) (SEQ. ID. No: 11)

and binds to a target 5' GCG TGG GCG 3' (SEQ ID No: 12).

Another suitable natural zinc finger protein as a source of framework
residues is Sp-1. The Sp-1 sequence used for construction of zinc finger proteins
corresponds to amino acids 531 to 624 in the Sp-1 transcription factor. This sequence is
94 amino acids in length. The amino acid sequence of Sp-1 is as follows:

PGKKKQHICHIQGCGKVYGKTSHLRAHLRWHTGERP
FMCTWSYCGKRFTRSDELQRHKRTHTGEKK
FACPECPKRFMRSDHLSKHIKTHQNKKG (SEQ. ID. No: 13)

Sp-1 binds to a target site 5'GGG GCG GGG3' (SEQ ID No: 14).

An alternate form of Sp-1, an Sp-1 consensus sequence, has the following
amino acid sequence:

meklrngsgd
PGKKKQHACPECGKSFSKSSHLRAHQRTHTGERP
YKCPECGKSFSRSDELQRHQRTHTGEKP
YKCPECGKSFSRSDHLSKHQRTHQNKKG (SEQ. ID. No: 15)

(lower case letters are a leader sequence from Shi & Berg, Chemistry and Biology 1, 83-89
(1995)). The optimal binding sequence for the Sp-1 consensus sequence is
5'GGGGCGGGG3' (SEQ ID No: 16). Other suitable ZFPs are described below.

There are a number of substitution rules that assist rational design of some
zinc finger proteins (see Desjarlais & Berg, PNAS 90, 2256-2260 (1993); Choo & Klug,
PNAS 91, 11163-11167 (1994); Desjarlais & Berg, PNAS 89, 7345-7349 (1992);
Jamieson et al., supra; Choo et al., WO 98/53057, WO 98/53058; WO 98/53059; WO
98/53060). Many of these rules are supported by site-directed mutagenesis of the
three-finger domain of the ubiquitous transcription factor, Sp-1 (Desjarlais and Berg, 1992;
1993). One of these rules is that a 5' G in a DNA triplet can be bound by a zinc finger
incorporating arginine at position 6 of the recognition helix. Another substitution rule is
that a G in the middle of a subsite can be recognized by including a histidine residue at
position 3 of a zinc finger. A further substitution rule is that asparagine can be
incorporated to recognize A in the middle of a triplet, aspartic acid, glutamic acid, serine or
threonine can be incorporated to recognize C in the middle of a triplet, and amino acids
with small side chains such as alanine can be incorporated to recognize T in the middle of
a triplet. A further substitution rule is that the 3' base of a triplet subsite can be recognized
by incorporating the following amino acids at position -1 of the recognition helix:
arginine to recognize G, glutamine to recognize A, glutamic acid (or aspartic acid) to
recognize C, and threonine to recognize T. Although these substitution rules are useful
in designing zinc finger proteins, they do not take into account all possible target sites.
Furthermore, the assumption underlying the rules, namely that a particular amino acid in
a zinc finger is responsible for binding to a particular base in a subsite, is only
approximate. Context-dependent interactions between proximate amino acids in a finger
or binding of multiple amino acids to a single base or vice versa can cause variation of the
binding specificities predicted by the existing substitution rules.
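The substitution rules recited above can be encoded as simple lookup tables, as in the following sketch. It is illustrative only: the function name is hypothetical, only the residues expressly named above are included (alanine alone stands in for "small side chains"), and any position not covered by a rule is reported as N. As noted above, such one-residue-to-one-base predictions are approximate.

```python
# Position -1 of the recognition helix -> 3' base of the triplet.
RULES_POS_MINUS1 = {"R": "G", "Q": "A", "E": "C", "D": "C", "T": "T"}
# Position 3 -> middle base of the triplet.
RULES_POS_3 = {"H": "G", "N": "A", "D": "C", "E": "C", "S": "C", "T": "C", "A": "T"}
# Position 6 -> 5' base of the triplet (only the arginine rule is recited).
RULES_POS_6 = {"R": "G"}


def predict_triplet(residues: str) -> str:
    """Predict a 5'->3' triplet from the seven residues at positions -1 to +6.

    Index 0 of the input is position -1, index 3 is position 3 and index 6
    is position 6. Bases not covered by a recited rule are returned as 'N'.
    """
    if len(residues) != 7:
        raise ValueError("expected seven residues (positions -1 to +6)")
    r = residues.upper()
    five_prime = RULES_POS_6.get(r[6], "N")
    middle = RULES_POS_3.get(r[3], "N")
    three_prime = RULES_POS_MINUS1.get(r[0], "N")
    return five_prime + middle + three_prime
```

For the Zif268 fingers, whose target is 5' GCG TGG GCG 3', the sketch predicts GCG for RSDELTR, NGG for RSDHLTT (no recited rule covers threonine at position 6) and GCG for RSDERKR.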

The technique of phage display provides a largely empirical means of
generating zinc finger proteins with a desired target specificity (see e.g., Rebar, US
5,789,538; Choo et al., WO 96/06166; Barbas et al., WO 95/19431 and WO 98/54311;
Jamieson et al., supra). The method can be used in conjunction with, or as an alternative
to, rational design. The method involves the generation of diverse libraries of
mutagenized zinc finger proteins, followed by the isolation of proteins with desired
DNA-binding properties using affinity selection methods. To use this method, the experimenter
typically proceeds as follows. First, a gene for a zinc finger protein is mutagenized to
introduce diversity into regions important for binding specificity and/or affinity. In a
typical application, this is accomplished via randomization of a single finger at positions
-1, +2, +3, and +6, and sometimes accessory positions such as +1, +5, +8 and +10. Next,
the mutagenized gene is cloned into a phage or phagemid vector as a fusion with gene III
of a filamentous phage, which encodes the coat protein pIII. The zinc finger gene is
inserted between segments of gene III encoding the membrane export signal peptide and
the remainder of pIII, so that the zinc finger protein is expressed as an amino-terminal
fusion with pIII or in the mature, processed protein. When using phagemid vectors, the
mutagenized zinc finger gene may also be fused to a truncated version of gene III
encoding, minimally, the C-terminal region required for assembly of pIII into the phage
particle. The resultant vector library is transformed into E. coli and used to produce
filamentous phage which express variant zinc finger proteins on their surface as fusions
with the coat protein pIII. If a phagemid vector is used, then this step requires
superinfection with helper phage. The phage library is then incubated with a target DNA
site, and affinity selection methods are used to isolate phage which bind target with high
affinity from bulk phage. Typically, the DNA target is immobilized on a solid support,
which is then washed under conditions sufficient to remove all but the tightest binding
phage. After washing, any phage remaining on the support are recovered via elution
under conditions which disrupt zinc finger - DNA binding. Recovered phage are used to
infect fresh E. coli, which is then amplified and used to produce a new batch of phage
particles. Selection and amplification are then repeated as many times as is necessary to
enrich the phage pool for tight binders such that these may be identified using sequencing
and/or screening methods. Although the method is illustrated for pIII fusions, analogous
principles can be used to screen ZFP variants as pVIII fusions.

In certain embodiments, the sequence bound by a particular zinc finger
protein is determined by conducting binding reactions (see, e.g., conditions for
determination of Kd, infra) between the protein and a pool of randomized double-stranded
oligonucleotide sequences. The binding reaction is analyzed by an electrophoretic
mobility shift assay (EMSA), in which protein-DNA complexes undergo retarded
migration in a gel and can be separated from unbound nucleic acid. Oligonucleotides
which have bound the finger are purified from the gel and amplified, for example, by a
polymerase chain reaction. The selection (i.e., binding reaction and EMSA analysis) is
then repeated as many times as desired, with the selected oligonucleotide sequences. In
this way, the binding specificity of a zinc finger protein having a particular amino acid
sequence is determined.

Zinc finger proteins are often expressed with a heterologous domain as
fusion proteins. Common domains for addition to the ZFP include, e.g., transcription
factor domains (activators, repressors, co-activators, co-repressors), silencers, oncogenes
(e.g., myc, jun, fos, myb, max, mad, rel, ets, bcl, myb, mos family members etc.); DNA
repair enzymes and their associated factors and modifiers; DNA rearrangement enzymes
and their associated factors and modifiers; chromatin associated proteins and their
modifiers (e.g. kinases, acetylases and deacetylases); and DNA modifying enzymes (e.g.,
methyltransferases, topoisomerases, helicases, ligases, kinases, phosphatases,
polymerases, endonucleases) and their associated factors and modifiers. A preferred
domain for fusing with a ZFP when the ZFP is to be used for repressing expression of a
target gene is a KRAB repression domain from the human KOX-1 protein (Thiesen et al.,
New Biologist 2, 363-374 (1990); Margolin et al., Proc. Natl. Acad. Sci. USA 91,
4509-4513 (1994); Pengue et al., Nucl. Acids Res. 22:2908-2914 (1994); Witzgall et al., Proc.
Natl. Acad. Sci. USA 91, 4514-4518 (1994)). Preferred domains for achieving activation
include the HSV VP16 activation domain (see, e.g., Hagmann et al., J. Virol. 71,
5952-5962 (1997)), nuclear hormone receptors (see, e.g., Torchia et al., Curr. Opin. Cell. Biol.
10:373-383 (1998)); the p65 subunit of nuclear factor kappa B (Bitko & Barik, J. Virol.
72:5610-5618 (1998) and Doyle & Hunt, Neuroreport 8:2937-2942 (1997); Liu et al.,
Cancer Gene Ther. 5:3-28 (1998)), or artificial chimeric functional domains such as
VP64 (Seipel et al., EMBO J. 11, 4961-4968 (1992)).

An important factor in the administration of polypeptide compounds, such
as the ZFPs, is ensuring that the polypeptide has the ability to traverse the plasma
membrane of a cell, or the membrane of an intra-cellular compartment such as the
nucleus. Cellular membranes are composed of lipid-protein bilayers that are freely
permeable to small, nonionic lipophilic compounds and are inherently impermeable to
polar compounds, macromolecules, and therapeutic or diagnostic agents. However,
proteins and other compounds such as liposomes have been described, which have the
ability to translocate polypeptides such as ZFPs across a cell membrane.

For example, "membrane translocation polypeptides" have amphiphilic or
hydrophobic amino acid subsequences that have the ability to act as
membrane-translocating carriers. In one embodiment, homeodomain proteins have the
ability to translocate across cell membranes. The shortest internalizable peptide of a
homeodomain protein, Antennapedia, was found to be the third helix of the protein, from
amino acid position 43 to 58 (see, e.g., Prochiantz, Current Opinion in Neurobiology
6:629-634 (1996)). Another subsequence, the h (hydrophobic) domain of signal peptides,
was found to have similar cell membrane translocation characteristics (see, e.g., Lin et
al., J. Biol. Chem. 270:14255-14258 (1995)).

Examples of peptide sequences which can be linked to a ZFP of the
invention, for facilitating uptake of ZFP into cells, include, but are not limited to: an 11
amino acid peptide of the tat protein of HIV; a 20 residue peptide sequence which
corresponds to amino acids 84-103 of the p16 protein (see Fahraeus et al., Current
Biology 6:84 (1996)); the third helix of the 60-amino acid long homeodomain of
Antennapedia (Derossi et al., J. Biol. Chem. 269:10444 (1994)); the h region of a signal
peptide such as the Kaposi fibroblast growth factor (K-FGF) h region (Lin et al., supra);
or the VP22 translocation domain from HSV (Elliott & O'Hare, Cell 88:223-233 (1997)).
Other suitable chemical moieties that provide enhanced cellular uptake may also be
chemically linked to ZFPs.

Toxin molecules also have the ability to transport polypeptides across cell
membranes. Often, such molecules are composed of at least two parts (called "binary
toxins"): a translocation or binding domain or polypeptide and a separate toxin domain or
polypeptide. Typically, the translocation domain or polypeptide binds to a cellular
receptor, and then the toxin is transported into the cell. Several bacterial toxins, including
Clostridium perfringens iota toxin, diphtheria toxin (DT), Pseudomonas exotoxin A (PE),
pertussis toxin (PT), Bacillus anthracis toxin, and pertussis adenylate cyclase (CYA),
have been used in attempts to deliver peptides to the cell cytosol as internal or
amino-terminal fusions (Arora et al., J. Biol. Chem. 268:3334-3341 (1993); Perelle et al.,
Infect. Immun. 61:5147-5156 (1993); Stenmark et al., J. Cell Biol. 113:1025-1032 (1991);
Donnelly et al., PNAS 90:3530-3534 (1993); Carbonetti et al., Abstr. Annu. Meet. Am.
Soc. Microbiol. 95:295 (1995); Sebo et al., Infect. Immun. 63:3851-3857 (1995); Klimpel
et al., PNAS U.S.A. 89:10277-10281 (1992); and Novak et al., J. Biol. Chem.
267:17186-17193 (1992)).

Such subsequences can be used to translocate ZFPs across a cell
membrane. ZFPs can be conveniently fused to or derivatized with such sequences.
Typically, the translocation sequence is provided as part of a fusion protein. Optionally, a
linker can be used to link the ZFP and the translocation sequence. Any suitable linker can
be used, e.g., a peptide linker.

Production of ZFPs 

ZFP polypeptides and nucleic acids encoding the same can be made using
routine techniques in the field of recombinant genetics. Basic texts disclosing the general
methods of use in this invention include Sambrook et al., Molecular Cloning, A
Laboratory Manual (2nd ed. 1989); Kriegler, Gene Transfer and Expression: A
Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al.,
eds., 1994). In addition, nucleic acids less than about 100 bases can be custom ordered
from any of a variety of commercial sources, such as The Midland Certified Reagent
Company (mcrc@oligos.com), The Great American Gene Company
(http://www.genco.com), ExpressGen Inc. (www.expressgen.com), and Operon Technologies
Inc. (Alameda, CA). Similarly, peptides can be custom ordered from any of a variety of
sources, such as PeptidoGenic (pkim@ccnet.com), HTI Bio-products, Inc.
(http://www.htibio.com), BMA Biomedicals Ltd (U.K.), and Bio.Synthesis, Inc.

Oligonucleotides can be chemically synthesized according to the solid
phase phosphoramidite triester method first described by Beaucage & Caruthers,
Tetrahedron Letts. 22:1859-1862 (1981), using an automated synthesizer, as described in
Van Devanter et al., Nucleic Acids Res. 12:6159-6168 (1984). Purification of
oligonucleotides is by either denaturing polyacrylamide gel electrophoresis or by reverse
phase HPLC. The sequence of the cloned genes and synthetic oligonucleotides can be
verified after cloning using, e.g., the chain termination method for sequencing
double-stranded templates of Wallace et al., Gene 16:21-26 (1981).

Two alternative methods are typically used to create the coding sequences
required to express newly designed DNA-binding peptides. One protocol is a PCR-based
assembly procedure that utilizes six overlapping oligonucleotides (Fig. 1). Three
oligonucleotides (oligos 1, 3, and 5 in Fig. 1) correspond to "universal" sequences that
encode portions of the DNA-binding domain between the recognition helices. These
oligonucleotides typically remain constant for all zinc finger constructs. The other three
"specific" oligonucleotides (oligos 2, 4, and 6 in Fig. 1) are designed to encode the
recognition helices. These oligonucleotides contain substitutions primarily at positions
-1, 2, 3 and 6 on the recognition helices, making them specific for each of the different
DNA-binding domains.
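
As an illustration of the position numbering used above, the following minimal Python sketch pulls out the residues at positions -1, 2, 3 and 6 from a recognition helix. It assumes, as a labeled assumption rather than something stated in this passage, that a seven-residue finger entry such as RSDELTR in Table 1 lists helix positions -1, 1, 2, 3, 4, 5 and 6 in order. The function names are hypothetical.

```python
# Assumption: a 7-residue helix string lists positions -1, 1, 2, 3, 4, 5, 6 in order.
HELIX_POSITIONS = (-1, 1, 2, 3, 4, 5, 6)
# Positions named in the text as the primary sites of substitution.
CONTACT_POSITIONS = (-1, 2, 3, 6)


def contact_residues(helix: str) -> dict:
    """Return the residues at positions -1, 2, 3 and 6 of a 7-residue helix."""
    if len(helix) != len(HELIX_POSITIONS):
        raise ValueError("expected a 7-residue recognition helix")
    by_position = dict(zip(HELIX_POSITIONS, helix.upper()))
    return {pos: by_position[pos] for pos in CONTACT_POSITIONS}


def differing_positions(helix_a: str, helix_b: str) -> list:
    """List the helix positions at which two 7-residue helices differ."""
    if len(helix_a) != len(HELIX_POSITIONS) or len(helix_b) != len(HELIX_POSITIONS):
        raise ValueError("expected 7-residue recognition helices")
    return [
        pos
        for pos, a, b in zip(HELIX_POSITIONS, helix_a.upper(), helix_b.upper())
        if a != b
    ]


if __name__ == "__main__":
    print(contact_residues("RSDELTR"))
    print(differing_positions("RSDELTR", "RSDHLTR"))
```

Comparing two helices this way shows which "specific" oligonucleotide would need to change between two designs, under the numbering assumption stated above.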

The PCR synthesis is carried out in two steps. First, a double stranded
DNA template is created by combining the six oligonucleotides (three universal, three
specific) in a four-cycle PCR reaction with a low temperature annealing step, thereby
annealing the oligonucleotides to form a DNA "scaffold." The gaps in the scaffold are
filled in by a high-fidelity thermostable polymerase; a combination of Taq and Pfu
polymerases also suffices. In the second phase of construction, the zinc finger template is
amplified by external primers designed to incorporate restriction sites at either end for
cloning into a shuttle vector or directly into an expression vector.
An alternative method of cloning the newly designed DNA-binding
proteins relies on annealing complementary oligonucleotides encoding the specific
regions of the desired ZFP. This particular application requires that the oligonucleotides
be phosphorylated prior to the final ligation step. This is usually performed before setting
up the annealing reactions. In brief, the "universal" oligonucleotides encoding the
constant regions of the proteins (oligos 1, 2 and 3 above) are annealed with their
complementary oligonucleotides. Additionally, the "specific" oligonucleotides encoding
the finger recognition helices are annealed with their respective complementary
oligonucleotides. These complementary oligos are designed to fill in the region which
was previously filled in by polymerase in the above-mentioned protocol. The
complementary oligos to the common oligos 1 and finger 3 are engineered to leave
overhanging sequences specific for the restriction sites used in cloning into the vector of
choice in the following step. The second assembly protocol differs from the initial
protocol in the following aspects: the "scaffold" encoding the newly designed ZFP is
composed entirely of synthetic DNA, thereby eliminating the polymerase fill-in step, and
the fragment to be cloned into the vector does not require amplification.
Lastly, the design of leaving sequence-specific overhangs eliminates the need for
restriction enzyme digests of the inserting fragment. Alternatively, changes to ZFP
recognition helices can be created using conventional site-directed mutagenesis methods.

Both assembly methods require that the resulting fragment encoding the
newly designed ZFP be ligated into a vector. Ultimately, the ZFP-encoding sequence is
cloned into an expression vector. Expression vectors that are commonly utilized include,
but are not limited to, a modified pMAL-c2 bacterial expression vector (New England
BioLabs) or a eukaryotic expression vector, pcDNA (Promega). The final constructs are
verified by sequence analysis.

Any suitable method of protein purification known to those of skill in the 
art can be used to purify ZFPs of the invention (see Ausubel, supra; Sambrook, supra).
In addition, any suitable host can be used for expression, e.g., bacterial cells, insect cells, 
yeast cells, mammalian cells, and the like. 

Expression of a zinc finger protein fused to a maltose binding protein
(MBP-ZFP) in bacterial strain JM109 allows for straightforward purification through an
amylose column (NEB). High expression levels of the zinc finger chimeric protein can
be obtained by induction with IPTG since the MBP-ZFP fusion in the pMAL-c2 expression
plasmid is under the control of the tac promoter (NEB). Bacteria containing the MBP-ZFP
fusion plasmids are inoculated into 2xYT medium containing 10 µM ZnCl2, 0.02%
glucose, plus 50 µg/ml ampicillin and shaken at 37°C. At mid-exponential growth IPTG
is added to 0.3 mM and the cultures are allowed to shake. After 3 hours the bacteria are
harvested by centrifugation, disrupted by sonication, by passage through a French
pressure cell, or through the use of lysozyme, and insoluble material is removed by
centrifugation. The MBP-ZFP proteins are captured on an amylose-bound resin, washed
extensively with buffer containing 20 mM Tris-HCl (pH 7.5), 200 mM NaCl, 5 mM DTT
and 50 µM ZnCl2, then eluted with maltose in essentially the same buffer (purification is
based on a standard protocol from NEB). Purified proteins are quantitated and stored for
biochemical analysis.

The dissociation constants of the purified proteins, e.g., Kd, are typically
characterized via electrophoretic mobility shift assays (EMSA) (Buratowski & Chodosh,
in Current Protocols in Molecular Biology pp. 12.2.1-12.2.7 (Ausubel ed., 1996)).
Affinity is measured by titrating purified protein against a fixed amount of labeled
double-stranded oligonucleotide target. The target typically comprises the natural
binding site sequence flanked by the 3 bp found in the natural sequence and additional,
constant flanking sequences. The natural binding site is typically 9 bp for a three-finger
protein and 2 x 9 bp + intervening bases for a six finger ZFP. The annealed
oligonucleotide targets possess a 1 base 5' overhang which allows for efficient labeling of
the target with T4 phage polynucleotide kinase. For the assay the target is added at a
concentration of 1 nM or lower (the actual concentration is kept at least 10-fold lower
than the expected dissociation constant), purified ZFPs are added at various
concentrations, and the reaction is allowed to equilibrate for at least 45 min. In addition,
the reaction mixture contains 10 mM Tris (pH 7.5), 100 mM KCl, 1 mM MgCl2, 0.1
mM ZnCl2, 5 mM DTT, 10% glycerol, and 0.02% BSA. (NB: in earlier assays poly d(IC)
was also added at 10-100 µg/µl.)
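
The reason for keeping the target concentration well below the expected dissociation constant can be sketched with a simple binding isotherm. The derivation below assumes a single protein binding a single site at equilibrium. This 1:1 model and the symbols θ, [P] and [D] are introduced here for illustration and are not taken from the text.

```latex
% Assumed 1:1 equilibrium: P + D <=> PD, with K_d = [P]_free [D]_free / [PD]
\theta \;=\; \frac{[PD]}{[D]_{\mathrm{total}}}
       \;=\; \frac{[P]_{\mathrm{free}}}{[P]_{\mathrm{free}} + K_d}

% Bound protein cannot exceed the total target, so
% [P]_total - [D]_total <= [P]_free <= [P]_total.
% When [D]_total << K_d, [P]_free is approximately [P]_total near the midpoint:
\theta \;\approx\; \frac{[P]_{\mathrm{total}}}{[P]_{\mathrm{total}} + K_d},
\qquad
\theta = \tfrac{1}{2} \;\Longleftrightarrow\; [P]_{\mathrm{total}} \approx K_d
```

Under these assumptions, the total protein concentration that shifts half of the labeled target approximates Kd only when depletion of free protein by the target is negligible, which is what a target concentration at least 10-fold below Kd is meant to ensure.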

The equilibrated reactions are loaded onto a 10% polyacrylamide gel,
which has been pre-run for 45 min in Tris/glycine buffer, then bound and unbound
labeled target are resolved by electrophoresis at 150 V. (Alternatively, 10-20% gradient
Tris-HCl gels, containing a 4% polyacrylamide stacker, can be used.) The dried gels are
visualized by autoradiography or phosphorimaging and the apparent Kd is determined by
calculating the protein concentration that gives half-maximal binding.
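
One way to compute the protein concentration that gives half-maximal binding from a quantitated gel is to interpolate between the two titration points that bracket it. The Python sketch below does this on a logarithmic concentration scale. The interpolation scheme, the function name `apparent_kd` and the example numbers are illustrative assumptions, not the procedure specified in the text.

```python
import math


def apparent_kd(protein_nM, fraction_bound):
    """Estimate the protein concentration giving half-maximal binding.

    protein_nM: protein concentrations (all > 0) used in the titration.
    fraction_bound: fraction of labeled target shifted at each concentration.
    Interpolates linearly in log(concentration) between the two points
    that bracket 50% of the highest observed binding.
    """
    pairs = sorted(zip(protein_nM, fraction_bound))
    half = 0.5 * max(f for _, f in pairs)
    for (c0, f0), (c1, f1) in zip(pairs, pairs[1:]):
        if f0 < half <= f1:
            t = (half - f0) / (f1 - f0)
            return math.exp(math.log(c0) + t * (math.log(c1) - math.log(c0)))
    raise ValueError("titration does not bracket half-maximal binding")


if __name__ == "__main__":
    # Hypothetical titration data, for illustration only.
    concentrations = [1, 5, 10, 40, 100, 1000]
    shifted = [0.05, 0.20, 0.33, 0.67, 0.83, 0.98]
    print(f"apparent Kd ~ {apparent_kd(concentrations, shifted):.1f} nM")
```

A curve fit to a binding isotherm would use all of the data points rather than just two; the interpolation is shown only because it mirrors the "half-maximal binding" wording directly.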
The assays can also include determining active fractions in the protein
preparations. Active fractions are determined by stoichiometric gel shifts where proteins
are titrated against a high concentration of target DNA. Titrations are done at 100, 50,
and 25% of target (usually at micromolar levels).
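
The arithmetic behind such a stoichiometric titration can be sketched as follows. The sketch assumes 1:1 binding and a target concentration far enough above Kd that essentially all active protein is bound, so that the amount of shifted target equals the amount of active protein. The function name and the example values are hypothetical.

```python
def active_fraction(target_uM, protein_to_target_ratios, fraction_target_shifted):
    """Average the active-fraction estimates from a stoichiometric titration.

    target_uM: concentration of target DNA (held constant, well above Kd).
    protein_to_target_ratios: nominal protein added as a fraction of target,
        e.g. 1.0, 0.5 and 0.25 for 100, 50 and 25% of target.
    fraction_target_shifted: fraction of target found in the shifted band
        at each ratio.
    Assumes 1:1 binding, so shifted target equals active protein.
    """
    estimates = []
    for ratio, shifted in zip(protein_to_target_ratios, fraction_target_shifted):
        nominal_protein_uM = ratio * target_uM
        bound_target_uM = shifted * target_uM
        estimates.append(bound_target_uM / nominal_protein_uM)
    return sum(estimates) / len(estimates)


if __name__ == "__main__":
    # Hypothetical gel quantitation, for illustration only.
    print(active_fraction(1.0, [1.0, 0.5, 0.25], [0.6, 0.3, 0.15]))
```

Dividing a nominal protein concentration by the active fraction's reciprocal, i.e. multiplying by the active fraction, gives the concentration of binding-competent protein to use when reporting an apparent Kd.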


III. Applications of Designed ZFPs 

ZFPs that bind to a particular target gene, and the nucleic acids encoding
them, can be used for a variety of applications. These applications include therapeutic
methods in which a ZFP or a nucleic acid encoding it is administered to a subject and
used to modulate the expression of a target gene within the subject (see copending
application Townsend & Townsend & Crew Attorney Docket 019496-002200, filed
January 12, 1999). The modulation can be in the form of repression, for example, when
the target gene resides in a pathological infecting microorganism, or in an endogenous
gene of the patient, such as an oncogene or viral receptor, that is contributing to a disease
state. Alternatively, the modulation can be in the form of activation when activation of
expression or increased expression of an endogenous cellular gene can ameliorate a
diseased state. For such applications, ZFPs, or more typically, nucleic acids encoding
them, are formulated with a pharmaceutically acceptable carrier as a pharmaceutical
composition.

Pharmaceutically acceptable carriers are determined in part by the
particular composition being administered, as well as by the particular method used to
administer the composition (see, e.g., Remington's Pharmaceutical Sciences, 17th ed.
1985). The ZFPs, alone or in combination with other suitable components, can be made
into aerosol formulations (i.e., they can be "nebulized") to be administered via inhalation.
Aerosol formulations can be placed into pressurized acceptable propellants, such as
dichlorodifluoromethane, propane, nitrogen, and the like. Formulations suitable for
parenteral administration, such as, for example, by intravenous, intramuscular,
intradermal, and subcutaneous routes, include aqueous and non-aqueous, isotonic sterile
injection solutions, which can contain antioxidants, buffers, bacteriostats, and solutes that
render the formulation isotonic with the blood of the intended recipient, and aqueous and
non-aqueous sterile suspensions that can include suspending agents, solubilizers,
thickening agents, stabilizers, and preservatives. Compositions can be administered, for
example, by intravenous infusion, orally, topically, intraperitoneally, intravesically or
intrathecally. The formulations of compounds can be presented in unit-dose or
multi-dose sealed containers, such as ampules and vials. Injection solutions and
suspensions can be prepared from sterile powders, granules, and tablets of the kind
previously described.

The dose administered to a patient should be sufficient to effect a
beneficial therapeutic response in the patient over time. The dose is determined by the
efficacy and Kd of the particular ZFP employed, the target cell, and the condition of the
patient, as well as the body weight or surface area of the patient to be treated. The size of
the dose also is determined by the existence, nature, and extent of any adverse side-effects
that accompany the administration of a particular compound or vector in a particular
patient.

In other applications, ZFPs are used in diagnostic methods for sequence
specific detection of target nucleic acid in a sample. For example, ZFPs can be used to
detect variant alleles associated with a disease or phenotype in patient samples. As an
example, ZFPs can be used to detect the presence of particular mRNA species or cDNA
in a complex mixture of mRNAs or cDNAs. As a further example, ZFPs can be used to
quantify copy number of a gene in a sample. For example, detection of loss of one copy
of a p53 gene in a clinical sample is an indicator of susceptibility to cancer. In a further
example, ZFPs are used to detect the presence of pathological microorganisms in clinical
samples. This is achieved by using one or more ZFPs specific to genes within the
microorganism to be detected. A suitable format for performing diagnostic assays
employs ZFPs linked to a domain that allows immobilization of the ZFP on an ELISA
plate. The immobilized ZFP is contacted with a sample suspected of containing a target
nucleic acid under conditions in which binding can occur. Typically, nucleic acids in the
sample are labeled (e.g., in the course of PCR amplification). Alternatively, unlabelled
probes can be detected using a second labelled probe. After washing, bound labelled
nucleic acids are detected.

ZFPs also can be used for assays to determine the phenotype and function
of gene expression. Current methodologies for determination of gene function rely
primarily upon either overexpression or removing (knocking out completely) the gene of
interest from its natural biological setting and observing the effects. The phenotypic
effects observed indicate the role of the gene in the biological system.
One advantage of ZFP-mediated regulation of a gene relative to
conventional knockout analysis is that expression of the ZFP can be placed under small
molecule control. By controlling expression levels of the ZFPs, one can in turn control
the expression levels of a gene regulated by the ZFP to determine what degree of
repression or stimulation of expression is required to achieve a given phenotypic or
biochemical effect. This approach has particular value for drug development. By putting
the ZFP under small molecule control, problems of embryonic lethality and
developmental compensation can be avoided by switching on the ZFP repressor at a later
stage in mouse development and observing the effects in the adult animal. Transgenic
mice having target genes regulated by a ZFP can be produced by integration of the
nucleic acid encoding the ZFP at any site in trans to the target gene. Accordingly,
homologous recombination is not required for integration of the nucleic acid. Further,
because the ZFP is trans-dominant, only one chromosomal copy is needed and therefore
functional knock-out animals can be produced without backcrossing.

All references cited above are hereby incorporated by reference in their
entirety for all purposes.
TABLE 1

SBS#  TARGET      SEQ ID  F1       SEQ ID  F2       SEQ ID  F3       SEQ ID  Kd (nM)
249   GCGGGGGCG   17      RSDELTR  123     RSDHLSR  229     RSDELRR  335     20
250   GCGGGGGCG   18      RSDELTR  124     RSDHLSR  230     RSDTLKK  336     70
251   GCGGAGGCG   19      RSDELTR  125     RSDNLTR  231     RSDELRR  337     27.5
252   GCGGCCGCG   20      RSDELTR  126     DRSSLTR  232     RSDELRR  338     100
253   GGATGGGGG   21      RSDHLAR  127     RSDHLTT  233     QRAHLAR  339     0.75
256   GCGGGGTCC   22      ERGDLTT  128     RSDHLSR  234     RSDELRR  340     800
258   GCGGGCGGG   23      RSDHLTR  129     ERGHLTR  235     RSDELRR  341     15
259   GCAGAGGAG   24      RSDNLAR  130     RSDNLAR  236     QSGSLTR  342     250
261   GAGGTGGCC   25      ERGTLAR  131     RSDALSR  237     RSDNLSR  343     0.5
262   GCGGGGGCT   26      QSSDLQR  132     RSDHLSR  238     RSDELRR  344     20
263   GCGGGGGCT   27      QSSDLQR  133     RSDHLSR  239     RSDTLKK  345     1
264   GTGGCTGCC   28      DRSSLTR  134     QSSDLQR  240     RSDALAR  346     27
265   GTGGCTGCC   29      ERGTLAR  135     QSSDLQR  241     RSDALAR  347     600
269   GGGGCCGGG   30      RSDHLTR  136     DRSSLTR  242     RSDHLTR  348     5
270   GGGGCCGGG   31      RSDHLTR  137     ERGTLAR  243     RSDHLTR  349     52.5
272   GCAGGGGCC   32      DRSSLTR  138     RSDHLSR  244     QSGSLTR  350     20
337   TGCGGGGCAA  33      RSADLTR  139     RSDHLTR  245     ERQHLAT  351     24
338   TGCGGGGCAA  34      RSADLTR  140     RSDHLTR  246     ERDHLRT  352     8
339   TGCGGGGCAA  35      RSADLTR  141     RSDHLTT  247     ERQHLAT  353     64
340   TGCGGGGCAA  36      RSADLTR  142     RSDHLTT  248     ERDHLRT  354     48
341   TGCGGGGCAA  37      RSADLTR  143     RGDHLKD  249     ERQHLAT  355     1000
342   TGCGGGGCAA  38      RSADLTR  144     RGDHLKD  250     ERDHLRT  356     1000
343   TGCGGGGCAA  39      QSGSLTR  145     RSDHLTR  251     ERQHLAT  357     8
344   TGCGGGGCAA  40      QSGSLTR  146     RSDHLTR  252     ERDHLRT  358     6
345   TGCGGGGCAA  41      QSGSLTR  147     RSDHLTT  253     ERQHLAT  359     96
346   TGCGGGGCAA  42      QSGSLTR  148     RSDHLTT  254     ERDHLRT  360     64
347   TGCGGGGCAA  43      QSGSLTR  149     RGDHLKD  255     ERQHLAT  361     1000



348   TGCGGGGCAA  44      QSGSLTR  150     RGDHLKD  256     ERDHLRT  362     1000
367   GGGGGCGGG   45      RSDHLTR  151     DSGHLTR  257     RSDHLQR  363     60
368   GAGGGGGCG   46      RSDELTR  152     RSDHLTR  258     RSDNLTR  364     3.5
369   GTAGTTGTG   47      RSDALTR  153     TGGSLAR  259     QSGSLTR  365     95
370   GTAGTTGTG   48      RSDALTR  154     NRATLAR  260     QSASLTR  366     300
371   GTAGTTGTG   49      RSDALTR  155     NRATLAR  261     QSGSLTR  367     175
372   GTAGTTGTG   50      RSDSLLR  156     TGGSLAR  262     QSASLTR  368     112.5
373   GTAGTTGTG   51      RSDSLLR  157     NRATLAR  263     QSASLTR  369     320
374   GCTGAGGAA   52      QRSNLVR  158     RSDNLTR  264     TSSELQR  370     3.3
375   GAGGAAGAT   53      QQSNLAR  159     QSGNLQR  265     RSDNLTR  371     85
401   GTAGTTGTG   54      RSDALTR  160     TGGSLAR  266     QSASLTR  372     80
403   GTAGTTGTG   55      RSDSLLR  161     NRATLAR  267     QSGSLTR  373     750
421   GTAGTTGTG   56      DSDSLLR  162     TGGSLAR  268     QSGSLTR  374     500
422   GTAGTTGTG   57      RSDSLLR  163     TGGSLTR  269     QSGSLTR  375     200
423   GTAGTTGTG   58      RSDALTR  164     TGGSLAR  270     QRSALAR  376     1000
424   GATGCTGAG   59      RSDNLTR  165     TSSELQR  271     TSANLSR  377     100
425   GATGCTGAG   60      RSDNLTR  166     QSSDLQR  272     QQSNLAR  378     25
426   GATGCTGAG   61      RSDNLTR  167     QSSDLQR  273     TSANLSR  379     5.5
427   GCTGAGGAA   62      QRSNLVR  168     RSDNLTR  274     QSSDLQR  380     1
428   GAAGATGAC   63      DSSNLTR  169     QQSNLAR  275     QRSNLVR  381     120
429   GAAGATGAC   64      DSSNLTR  170     TSANLSR  276     QRSNLVR  382     50
430   GATGACGAC   65      EKANLTR  171     DSSNLTR  277     QQSNLAR  383     250
431   GACGACGGC   66      DSGHLTR  172     DRSNLER  278     DSSNLTR  384     100
432   GACGACGGC   67      DSGHLTR  173     DHANLAR  279     DSSNLTR  385     1000
433   GACGACGGC   68      DSGNLTR  174     DHANLAR  280     DSSNLTR  386     1000
434   GACGGCGTA   69      QSASLTR  175     DSGHLTR  281     EKANLTR  387     152.5
435   GACGGCGTA   70      QSASLTR  176     DSGHLTR  282     ERGNLTR  388     150
436   GACGGCGTA   71      QRSALAR  177     DSGHLTR  283     EKANLTR  389     95
437   GACGGCGTA   72      QRSALAR  178     DSGHLTR  284     ERGNLTR  390     117.5
438   GAGGGGGCG   73      RSDELTR  179     RSDHLTT  285     RSDNLTR  391     62.5
440   GCCGAGGTGC  74      RSDSLLR  180     RSKNLQR  286     ERGTLAR  392     40
441   GGTGGAGTCA  75      DSGSLTR  181     QSGHLQR  287     TSGHLTR  393     250
445   GTCGCAGTGA  76      RSDSLRR  182     QSSDLQK  288     DSGSLTR  394     1000



450   GACTTGGTGC  77      RSDTLAR  183     RGDALTS  289     DRSNLTR  395     130
453   GGTGGAGTCA  78      DRSALAR  184     QSGHLQR  290     DSSKLSR  396     150
461   GAGTACTGTA  79      QRSHLTT  185     DRSNLRT  291     RSDNLAR  397     120
463   GTGGAGGAGA  80      RSDNLTR  186     RSDNLAR  292     RSDALAR  398     0.5
464   GTGGAGGAGA  81      RSDNLTR  187     RSDNLAR  293     RSDSLAR  399     0.4
466   CAGGCTGCGC  82      RSDDLTR  188     QSSDLQR  294     RSDNLRE  400     65
467   CAGGCTGCGC  83      RSDELTR  189     QSSDLQR  295     RGDHLKD  401     800
468   CAGGCTGCGC  84      RSDDLTR  190     QSSDLQR  296     RGDHLKD  402     42
469   GAAGAGGTCT  85      DRSALAR  191     RSDNLAR  297     QSGNLTR  403     13.5
472   GAGGTCTGGA  86      RSSHLTT  192     DRSALAR  298     RSDNLAR  404     80
476   GGAGAGGATG  87      TTSNLRR  193     RSDNLAR  299     QSDHLTR  405     80
477   GGAGAGGATG  88      TTSNLRR  194     RSDNLAR  300     QRAHLAR  406     100
478   GGAGAGGATG  89      TTSNLRR  195     RSDNLAR  301     QSGHLRR  407     60
479   GTGGCGGACC  90      DSSNLTR  196     RSDELQR  302     RSDALAR  408     8.5
480   GTGGCGGACC  91      DSSNLTR  197     RADTLRR  303     RSDALAR  409     5
483   GAGGGCGAAG  92      QSANLAR  198     ESSKLKR  304     RSDNLAR  410     130
484   GAGGGCGAAG  93      QSDNLAR  199     ESSKLKR  305     RSDNLAR  411     1000
485   GGAGAGGTTT  94      QSSALAR  200     RSDNLAR  306     QRAHLAR  412     110
487   GGAGAGGTTT  95      NRATLAR  201     RSDNLAR  307     QSGHLAR  413     76.9
488   TGGTAGGGGG  96      RSDHLAR  202     RSDNLTT  308     RSDHLTT  414     35
490   TAGGGGGTGG  97      RSDSLLR  203     RSDHLTR  309     RSDNLTT  415     1.5
503   GCCGAGGTGC  98      RSDSLLR  204     RSDNLAR  310     ERGTLAR  416     50
504   GCCGAGGTGC  99      RSDSLLR  205     RSDNLAR  311     DRSDLTR  417     25
505   GCCGAGGTGC  100     RSDSLLR  206     RSDNLAR  312     DCRDLAR  418     65
526   GCGGGCGGGC  101     RSDHLTR  207     ERGHLTR  313     RSDTLKK  419     8
543   GAGTGTGTGA  102     RSDLLQR  208     MSHHLKE  314     RSDHLSR  420     50
544   GAGTGTGTGA  103     RSDSLLR  209     MSHHLKE  315     RSDNLAR  421     125
545   GAGTGTGTGA  104     RKDSLVR  210     TSDHLAS  316     RSDNLTR  422     32
546   GAGTGTGTGA  105     RSDLLQR  211     [illeg.] 317     RLDGLRT  423     500
547   GAGTGTGTGA  106     RKDSLVR  212     TSGHLTS  318     RSDNLTR  424     500
548   GAGTGTGTGA  107     RSSLLQR  213     MSHHLKT  319     RSDHLSR  425     500
549   GAGTGTGTGA  108     RSSLLQR  214     MSHHLKE  320     RSDHLSR  426     500
550   GAGTGTGTGA  109     RKDSLVR  215     TKDHLAS  321     RSDNLTR  427     20
551   GAGTGTGTGA  110     RSDLLQR  216     MSHHLKT  322     RSDHLSR  428     50
552   GAGTGTGTGA  111     RKDSLVR  217     MSHHLKT  323     RSDNLTR  429     31
553   GAGTGTGTGA  112     RSDSLLR  218     MSHHLKE  324     RSDNLTR  430     125
554   GAGTGTGTGA  113     RKDSLVR  219     TSDHLAS  325     RSDNLAR  431     62.5
558   TGCGGGGCA   114     QSGDLTR  220     RSDHLTR  326     DSGHLAS  432     21
559   GAGTGTGTGA  115     RSDSLLR  221     TSDHLAS  327     RSDNLAR  433     1000
560   GAGTGTGTGA  116     RSSLLQR  222     MSHHLKT  328     RSDHLSR  434     500
561   GAGTGTGTGA  117     RKDSLVR  223     MSHHLKE  329     RSDNLAR  435     1000
562   GAGTGTGTGA  118     RSDSLLR  224     TSGHLTS  330     RSDNLAR  436     1000
565   GATGCTGAG   119     RSDNLTR  225     TSSELQR  331     QQSNLAR  437     100
567   GAAGATGAC   120     EKANLTR  226     TSANLSR  332     QRSNLVR  438     47.5
568   GATGACGAC   121     EKANLTR  227     DSSNLTR  333     TSANLSR  439     300
569   GTAGTTGTG   122     RSDSLLR  228     TGGSLAR  334     QRSALTR  440     52



TABLE 2

SBS#  TARGET      SEQ ID  F1       SEQ ID  F2       SEQ ID  F3       SEQ ID  Kd (nM)
201   GCAGCCTTG   441     RSDSLTS  646     ERSTLTR  851     QRADLRR  1056    1000
202   GCAGCCTTG   442     RSDSLTS  647     ERSTLTR  852     QRADLAR  1057    1000
203   GCAGCCTTG   443     RSDSLTS  648     ERSTLTR  853     QRATLRR  1058    1000
204   GCAGCCTTG   444     RSDSLTS  649     ERSTLTR  854     QRATLAR  1059    1000
205   GAGGTAGAA   445     QSANLAR  650     QSATLAR  855     RSDNLSR  1060    80
206   GAGGTAGAA   446     QSANLAR  651     QSAVLAR  856     RSDNLSR  1061    1000
207   GAGTGGTTA   447     QRASLAS  652     RSDHLTT  857     RSDNLAR  1062    70
208   TAGGTCTTA   448     QRASLAS  653     DRSALAR  858     RSDNLAS  1063    1000
209   GGAGTGGTT   449     QSSALAR  654     RSDALAR  859     QRAHLAR  1064    35
210   GGAGTGGTT   450     NRDTLAR  655     RSDALAR  860     QRAHLAR  1065    65
211   GGAGTGGTT   451     QSSALAR  656     RSDALAS  861     QRAHLAR  1066    140
212   GGAGTGGTT   452     NRDTLAR  657     RSDALAS  862     QRAHLAR  1067    400
213   GTTGCTGGA   453     QRAHLAR  658     QSSTLAR  863     QSSALAR  1068    1000
214   GTTGCTGGA   454     QRAHLAR  659     QSSTLAR  864     NRDTLAR  1069    1000
215   GAAGTCTGT   455     NRDHLMV  660     DRSALAR  865     QSANLSR  1070    1000
216   GAAGTCTGT   456     NRDHLTT  661     DRSALAR  866     QSANLSR  1071    1000
217   GAGGTCGTA   457     QRSALAR  662     DRSALAR  867     RSDNLAR  1072    40
219   GATGTTGAT   458     QQSNLAR  663     NRDTLAR  868     NRDNLSR  1073    1000
220   GATGTTGAT   459     QQSNLAR  664     NRDTLAR  869     QQSNLSR  1074    1000
221   GATGAGTAC   460     DRSNLRT  665     RSDNLAR  870     NRDNLAR  1075    1000
222   GATGAGTAC   461     ERSNLRT  666     RSDNLAR  871     NRDNLAR  1076
223   GATGAGTAC   462     DRSNLRT  667     RSDNLAR  872     QQSNLAR  1077    105
224   GATGAGTAC   463     ERSNLRT  668     RSDNLAR  873     QQSNLAR  1078    1000
225   TGGGAGGTC   464     DRSALAR  669     RSDNLAR  874     RSDHLTT  1079    6
226   GCAGCCTTG   465     RGDALTS  670     ERGTLAR  875     QSGSLTR  1080    1000
227   GCAGCCTTG   466     RGDALTV  671     ERGTLAR  876     QSGSLTR  1081    1000
228   GCAGCCTTG   467     RGDALTM  672     ERGTLAR  877     QSGSLTR  1082    1000
229   GCAGCCTTG   468     RGDALTS  673     ERGTLAR  878     RSDELTR  1083    1000



230   GCAGCCTTG   469     RGDALTV  674     ERGTLAR  879     RSDELTR  1084    1000
231   GCAGCCTTG   470     RGDALTM  675     ERGTLAR  880     RSDELTR  1085    1000
232   GGTGTGGTG   471     RSDALTR  676     RSDALAR  881     NRSHLAR  1086    50
233   GGTGTGGTG   472     RSDALTR  677     RSDALAR  882     QASHLAR  1087    100
235   GTAGAGGTG   473     RSDALTR  678     RSDNLAR  883     QRGALAR  1088    80
236   GGGGAGGGG   474     RSDHLAR  679     RSDNLAR  884     RSDHLSR  1089    0.3
237   GGGGAGGCC   475     ERGTLAR  680     RSDNLAR  885     RSDHLSR  1090    0.3
238   GGGGAGGCC   476     ERGTLAR  681     RSDNLQR  886     RSDHLSR  1091    0.8
239   GGCGGGGAG   477     RSDNLTR  682     RSDHLTR  887     DRSHLAR  1092    0.4
240   GCAGGGGAG   478     RSDNLTR  683     RSDHLSR  888     QSGSLTR  1093    1
242   GGGGGTGCT   479     QSSDLRR  684     QSSHLAR  889     RSDHLSR  1094    1
243   GTGGGCGCT   480     QSSDLRR  685     DRSHLAR  890     RSDALAR  1095    75
244   TAAGAAGGG   481     RSDHLAR  686     QSGNLTR  891     QSGNLRT  1096    100
245   TAAGAAGGG   482     RSDHLAR  687     QSANLTR  892     QSGNLRT  1097    235
246   GAAGGGGAG   483     RSDNLAR  688     RSDHLAR  893     QSGNLTR  1098    2
247   GAAGGGGAG   484     RSDNLAR  689     RSDHLAR  894     QSGNLRR  1099    2
276   GCGGCCGCG   485     RSDELTR  690     ERGTLAR  895     RSDERKR  1100    90
277   GCGGCCGCG   486     RSDELTR  691     DRSSLTR  896     RSDERKR  1101    107
278   GCGGCCGCG   487     QSWELTR  692     ERGTLAR  897     RSDERKR  1102    190
279   GCGGCCGCG   488     QSWELTR  693     DRSSLTR  898     RSDERKR  1103    260
280   GCGGCCGCG   489     QSGSLTR  694     ERGTLAR  899     RSDERKR  1104    160
281   GCGGCCGCG   490     QSGSLTR  695     DRSSLTR  900     RSDERKR  1105    225
282   GCAGAAGTG   491     RGDALTR  696     QSANLTR  901     QSADLAR  1106    1000
283   GCAGAAGTG   492     RSDALTR  697     QSGNLTR  902     QSGSLTR  1107    2
284   GCGGCCGCG   493     QSGSLTR  698     RSDHLTT  903     RSDERKR  1108    1000
285   TGTGCGGCC   494     ERGTLAR  699     RSDELTR  904     SRDHLQS  1109    1000
287   GCAGAAGCG   495     RGPDLAR  700     QSANLTR  905     QSGSLTR  1110    1000
288   GCAGAAGCG   496     RGPDLAR  701     QSANLTR  906     QSGSLTR  1111    1000
289   GCAGAAGCG   497     RGPDLAR  702     QSGNLQR  907     QSGSLTR  1112
290   GCAGAAGCG   498     RSDELAR  703     QSANLQR  908     QSADLAR  1113    1000
292   GCAGAAGCG   499     RSDELTR  704     QSANLQR  909     QSGSLTR  1114    1000
293   GTGTGCGGC   500     DRSHLTR  705     ERHSLQT  910     RSDALTR  1115    320
296   TGCGCGGCC   501     ERGTLAR  706     RSDELTR  911     DRDHLQS  1116    1000
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297 


TGCGCGGCC 


502 


ERGTLAR 70 7 


RSDELRR 912 


DRSHLQT 1117 


500 


298 


GCTTAGGCA 


503 


QTGELRR 70 8 


RSDNLQK 913 


TSGDLSR 1118 


4000 


299 


GCTTAGGCA 


504 


QTSDLRR 70 9 


RSDNLQK 914 


QSSDLQR 1119 


4000 


300 


GCTTAGGCA 


505 


QTADLRR 710 


RSDNLQR 915 


QSSDLSR 1120 


400 


301 


GCTTAGGCA 


506 


QSADLRR 711 


RSDNLQT 916 


QSSDLSR 1121 


350 


302 


GCTTAGGCA 


507 


QSGSLTR 712 


RSDNLQT 917 


QSSDLSR 1122 


75 


303 


GCTTAGGCA 


508 


QTGSLTR 713 


RSDNLQT 918 


QSSDLSR 1123 


135 


304 


GCTTAGGCA 


509 


QTADLTR 714 


RSDNLQT 919 


QSSDLSR 1124 


230 


305 


GCTTAGGCA 


510 


QTGDLTR 715 


RSDNLQT 92 0 


QSSDLSR 1125 


230 


306 


GCTTAGGCA 


511 


QTASLTR 716 


RSDNLQT 921 


QSSDLSR 1126 


280 


307 


GAAGAAGCG 


512 


RSDELRR 717 


QSGNLQR 92 2 


QSGNLSR 112 7 


50 . 5 


308 


GAAGAAGCG 


513 


RSDELRR 718 


QSANLQR 923 


QSANLQR 112 8 


1000 


309 


GGAGATGCC 


514 


ERSDLRR 719 


QSSNLQR 924 


QSGHLSR 112 9 


4000 


310 


GGAGATGCC 


515 


DRSDLTR 72 0 


NRDNLQT 925 


QSGHLSR 113 0 


1000 


311 


GGAGATGCC 


516 


DRSTLTR 721 


NRDNLQR 926 


QSGHLSR 1131 


170 


312 


GGAGATGCC 


517 


ERGTLAR 722 


NRDNLQR 92 7 


QSGHLSR 1132 


2000 


313 


GGAGATGCC 


518 


DRSDLTR 723 


QRSNLQR 92 8 


QSGHLSR 113 3 


1000 


314 


GGAGATGCC 


519 


DRSSLTR 724 


QSSNLQR 92 9 


QSGHLSR 1134 


117 . 5 


315 


GGAGATGCC 


520 


ERGTLAR 72 5 


QSSNLQR 93 0 


QSGHLSR 113 5 


265 


316 


GGAGATGCC 


521 


ERGTLAR 72 6 


QRDNLQR 931 


QSGHLSR 113 6 


3000 


318 


TAGGAGATGC 


522 


RSDALTS 72 7 


RSDNLAR 93 2 


RSDNLAS 113 7 


100 


319 


GGGGAAGGG 


523 


KTSHLRA 72 8 


QSGNLSR 93 3 


RSDHLSR 113 8 


125 


320 


GGGGAAGGG 


524 


RSDHLTR 72 9 


QSGNLSR 93 4 


RSDHLSR 113 9 


5 


321 


GGCGGAGAT 


525 


TTSNLRR 73 0 


QSGHLQR 93 5 


DRSHLTR 114 0 


200 


323 


GGCGGAGAT 


526 


TTSNLRR 731 


QSGHLQR 93 6 


DRDHLTR 1141 


600 


324 


GGCGGAGAT 


527 


TTSNLRR 73 2 


QSGHLQR 93 7 


DRDHLTR 1142 


200 


325 


GTATCTGCT 


528 


NSSDLTR 73 3 


NSDVLTS 93 8 


QSDVLTR 1143 


1000 


326 


GTATCTGTT 


529 


NSDALTR 73 4 


NSDVLTS 93 9 


QSDVLTR 1144 


1000 


3 2 7 


TCTGCTGGG 


53 0 


RSDHLTR 73 5 


NSADLTR 94 0 


NSDDLTR 1145 


1000 


328 


TCTGTTGGG 


531 


RSDHLTR 73 6 


NS SALTS 941 


NSDDLTR 1146 


1000 


349 


GGTGTCGCC 


532 


DCRDLAR 73 7 


DSGSLTR 94 2 


TSGHLTR 114 7 


1000 


350 


TCCGAGGGT 


533 


TSGHLTR 73 8 


RSDNLTR 94 3 


DCRDLTT 114 8 


332 


351 


GCTGGTGTC 


534 


DSGSLTR 73 9 


TSGHLTR 944 


TLHTLTR 114 9 


1000 



352 GGAGGGGTG 535 RSDSLLR 740 RSDHLTR 945 QSDHLTR 1150 26
353 GTTGGAGCC 536 DCRDLAR 741 QSDHLTR 946 TSGALTR 1151 1000
354 GAAGAGGAC 537 DSSNLTR 742 RSDNLTR 947 QRSNLVR 1152 28
355 GAAGAGGAC 538 EKANLTR 743 RSDNLTR 948 QRSNLVR 1153 20
356 GGCTGGGCG 539 RSDELRR 744 RSDHLTK 949 DSDHLSR 1154 1000
357 GGCTGGGCG 540 RSDELRR 745 RSDHLTK 950 DSDHLSR 1155 1000
358 GGCTGGGCG 541 RSDELRR 746 RSDHLTK 951 DSSHLSR 1156 225
361 GGGTTTGGG 542 RSDHLTR 747 QSSALTR 952 RSDHLTR 1157 130
363 GGGTTTGGG 543 RSDHLTR 748 QSSVLTR 953 RSDHLTR 1158 200
364 GTGTCCGAAG 544 RSDNLTR 749 DSAVLTT 954 RSDSLTR 1159 1000
365 GGTGCTGGT 545 QASHLTR 750 QASVLTR 955 QASHLTR 1160 600
366 GAGGGTGCT 546 QASVLTR 751 QASHLTR 956 RSDNLTR 1161 1000
367 GGGGGCGGG 547 RSDHLTR 752 DSGHLTR 957 RSDHLQR 1162 60
368 GAGGGGGCG 548 RSDELTR 753 RSDHLTR 958 RSDNLTR 1163 3.5
369 GTAGTTGTG 549 RSDALTR 754 TGGSLAR 959 QSGSLTR 1164 95
370 GTAGTTGTG 550 RSDALTR 755 NRATLAR 960 QSASLTR 1165 300
371 GTAGTTGTG 551 RSDALTR 756 NRATLAR 961 QSGSLTR 1166 175
372 GTAGTTGTG 552 RSDSLLR 757 TGGSLAR 962 QSASLTR 1167 112.5
373 GTAGTTGTG 553 RSDSLLR 758 NRATLAR 963 QSASLTR 1168 320
374 GCTGAGGAA 554 QRSNLVR 759 RSDNLTR 964 TSSELQR 1169 3.3
375 GAGGAAGAT 555 QQSNLAR 760 QSGNLQR 965 RSDNLTR 1170 85
377 GTGTTGGCAG 556 QSGSLTR 761 RGDALTS 966 RSDALTR 1171 89
378 GCCGAGGAGA 557 RSDNLTR 762 RSDNLTR 967 DRSSLTR 1172 31
379 GCCGAGGAGA 558 RSDNLTR 763 RSDNLTR 968 ERGTLAR 1173 3
380 GAGTCGGAAG 559 QSANLAR 764 RSDELTT 969 RSDNLAR 1174 1000
381 GCAGCTGCGC 560 RSDELTR 765 QSSDLQR 970 QSGDLTR 1175 1.5
383 TGGTTGGTAT 561 QSATLAR 766 RGDALTS 971 RSDHLTT 1176 1000
384 GTGGGCTTCA 562 DRSALTT 767 DRSHLAR 972 RSDALAR 1177 60
385 GGGGCGGAGC 563 RSDNLTR 768 RSDTLKK 973 PCTlUT CD 1 1 'I Q
386 GGGGCGGAGC 564 RSDNLTR 769 RSDELQR 974 RSDHLSR 1179 0.4
387 GGCGAGGCAA 565 QSGSLTR 770 RSDNLAR 975 DRSHLAR 1180 2.5
388 GGCGAGGCAA 566 QSGDLTR 771 RSDNLAR 976 DRSHLAR 1181 28
390 GTGGCAGCGG 567 RSDTLKK 772 QSSDLQK 977 RSDALAR 1182 20
392 GTGGCAGCGG 568 RSDELTR 773 QSSDLQK 978 RSDALAR 1183 1000
396 GCGGGAGCAG 569 QSGSLTR 774 QSGHLQR 979 RSDTLKK 1184 18.8
397 GCGGGAGCAG 570 QSGDLTR 775 QSGHLQR 980 RSDTLKK 1185 25
400 TCAGTGGTGG 571 RSDALAR 776 RSDSLAR 981 QSGDLRT 1186 40
405 GCGGCCGCA 572 RSDELTR 777 ERGTLAR 982 RSDERKR 1187 110
406 GCGGCCGCA 573 RSDELTR 778 DRSSLTR 983 RSDERKR 1188 110
407 GCGGCCGCA 574 QSWELTR 779 ERGTLAR 984 RSDERKR 1189 410
408 GCGGCCGCA 575 QSWELTR 780 DRSSLTR 985 RSDERKR 1190 380
409 GCGGCCGCA 576 QSGSLTR 781 ERGTLAR 986 RSDERKR 1191 50
410 GCAGAAGTC 577 RSDALTR 782 QSGNLTR 987 QSGSLTR 1192 3
411 GCGGCCGCA 578 QSGSLTR 783 RSDHLTT 988 RSDERKR 1193 1000
412 GCGTGGGCG 579 QSGSLTR 784 RSDHLTT 989 RSDERKR 1194 5
413 GCGTGGGCA 580 QSGSLTR 785 RSDHLTT 990 RSDERKR 1195 5
414 GCAGAAGCA 581 RSDELTR 786 QSANLQR 991 QSGSLTR 1196 1000
415 GTGTGCGGA 582 DRSHLTR 787 ERHSLQT 992 RSDALTR 1197 1000
416 TGTGCGGCC 583 ERGTLAR 788 RSDELRR 993 DRSHLQT 1198 1000
493 GGGGTGGCGG 584 RSDTLKK 789 RSDSLAR 994 RSDHLSR 1199 300
494 GCCGAGGAGA 585 RSDNLTR 790 RSDNLTR 995 DRSSLTR 1200 90
496 GGTGGTGGC 586 DTSHLRR 791 TSGHLQR 996 TSGHLSR 1201 1000
497 GTTTGCGTC 587 ETASLRR 792 DSAHLQR 997 TSSALSR 1202 1000
498 GAAGAGGCA 588 QTGELRR 793 RSDNLQR 998 QSGNLSR 1203 30
499 GCTTGTGAG 589 RTSNLRR 794 TSSHLQK 999 DTDHLRR 1204 1000
500 GCTTGTGAG 590 RSDNLTR 795 QSSNLQT 1000 DRSHLAR 1205 1000
501 GTGGGGGTT 591 NRATLAR 796 RSDHLSR 1001 RSDALAR 1206 8
502 GGGGTGGGA 592 QSAHLAR 797 RSDALAR 1002 RSDHLSR 1207 60
507 GAGGTAGAGG 593 RSDNLAR 798 QRSALAR 1003 RSDNLAR 1208 10
508 GAGGTAGAGG 594 RSDNLAR 799 QSATLAR 1004 RSDNLAR 1209 10
509 GTCGTGTGGC 595 RSDHLTT 800 RSDALAR 1005 DRSALAR 1210 100
510 GTTGAGGAAG 596 QSGNLAR 801 RSDNLAR 1006 NRATLAR 1211 100
511 GTTGAGGAAG 597 QSGNLAR 802 RSDNLAR 1007 QSSALAR 1212 100
512 GAGGTGGAAG 598 QSGNLAR 803 RSDALAR 1008 RSDNLAR 1213 10
513 GAGGTGGAAG 599 QSANLAR 804 RSDALAR 1009 RSDNLAR 1214 1.5
514 TAGGTGGTGG 600 RSDALTR 805 RSDALAR 1010 RSDNLTT 1215 10



515 TGGGAGGAGT 601 RSDNLTR 806 RSDNLTR 1011 RSDHLTT 1216 0.5
516 GGAGGAGCT 602 TTSELRR 807 QSGHLQR 1012 QSGHLSR 1217 700
517 GGAGCTGGGG 603 RTDHLRR 808 TSSELQR 1013 QSGHLSR 1218 50
518 GGGGGAGGAG 604 QTGHLRR 809 QSGHLQR 1014 RSDHLSR 1219 30
519 GGGGAGGAGA 605 RSDMLAR 810 RSDNLSR 1015 RSDHLSR 1220 0.3
520 GGAGGAGAT 606 TTANLRR 811 QSGHLQR 1016 QSGHLSR 1221 300
521 GCAGCAGGA 607 QTGHLRR 812 QSGELQR 1017 QSGELSR 1222 1000
522 GATGAGGCA 608 QTGELRR 813 RSDNLQR 1018 TSANLSR 1223 200
527 GGGGAGGATC 609 TTSNLRR 814 RSSNLQR 1019 RSDHLSR 1224 2
528 GGGGAGGATC 610 TTSNLRR 815 RSSNLQR 1020 RSDHLSR 1225 10
529 GAGGCTTGGG 611 RTDHLRK 816 TSAELQR 1021 RSSNLSR 1226 1000
531 GCGGAGGCTT 612 TTGELRR 817 RSSNLQR 1022 RSDELSR 1227 160
532 GCGGAGGCTT 613 QSSDLQR 818 RSSNLQR 1023 RSDELSR 1228 100
533 GCGGAGGCTT 614 QSSDLQR 819 RSDNLAR 1024 RSADLSR 1229 7
534 GCGGAGGCTT 615 QSSDLQR 820 RSDNLAR 1025 RSDDLRR 1230 10
535 GCAGCCGGG 616 RTDHLRR 821 ESSDLQR 1026 QSGELSR 1231 1000
538 GCAGAGGCTT 617 QSSDLQR 822 RSDNLAR 1027 QSGSLTR 1232 70
540 TGGGCAGGCC 618 DRSHLTR 823 QSGSLTR 1028 RSDHLTT 1233 55
541 GGGGAGGAT 619 TTSNLRR 824 RSSNLQR 1029 RSDHLSR 1234 3
570 GGGGAAGGCT 620 DSGHLTR 825 QRSNLVR 1030 RSDHLTR 1235 20
571 GTGTGTGTGT 621 RSDSLTR 826 QRSNLVR 1031 RSDSLLR 1236 1000
572 GCATACGTGG 622 RSDSLLR 827 DKGNLQS 1032 QSDDLTR 1237 1000
573 GCATACGTG 623 RSDSLLR 828 DKGNLQS 1033 QSGDLTR 1238 1000
574 TACGTGGGGT 624 RSDHLTR 829 RSDHLTR 1034 DKGNLQT 1239 25
575 TACGTGGGCT 625 DFSHLTR 830 RSDHLTR 1035 DKGNLQT 1240 472
576 GAGGGTGTTG 626 NSDTLAR 831 TSGHLTR 1036 RSDNLTR 1241 200
577 GGAGCGGGGA 627 RSDHLSR 832 RSDELQR 1037 QSDHLTR 1242 200
579 GGGGTTGAGG 628 RSDNLTR 833 NRDTLAR 1038 TSGHLTR 1243 200
580 GGTGTTGGAG 629 QRAHLAR 834 NRDTLAR 1039 TSGHLTR 1244 1000
581 TACGTGGGTT 630 QSSHLTR 835 RSDSLLR 1040 DKGNLQT 1245 382
583 GTAGGGGTTG 631 NSSALTR 836 RSDHLTR 1041 QSASLTR 1246 46
584 GAAGGCGGAG 632 QAGHLTR 837 DKSHLTR 1042 QSGNLTR 1247 1000
585 GAAGGCGGAG 633 QAGHLTR 838 DSGHLTR 1043 QSGNLTR 1248 1000



587 GGGGGTTACG 634 DKGNLQT 839 TSGHLTR 1044 RSDHLSK 1249 500
588 GGGGGGGGGG 635 RSDHLSR 840 RSDHLTR 1045 RSDHLSK 1250 30
589 GGAGTATGCT 636 DSGHLAS 841 QSATLAR 1046 QSDHLTR 1251 1000
595 TGGTTGGTAT 637 QRGSLAR 842 RGDALTR 1047 RSDHLTT 1252 73.3
597 TGGTTGGTA 638 QNSAMRK 843 RGDALTS 1048 RSDHLTT 1253 1000
598 TGGTTGGTA 639 QRGSLAR 844 RDGSLTS 1049 RSDHLTT 1254 1000
599 TGGTTGGTA 640 QNSAMRK 845 RDGSLTS 1050 RSDHLTT 1255 1000
600 GAGTCGGAA 641 QSANLAR 846 RSDELRT 1051 RSDNLAR 1256 206.7
601 GAGTCGGAA 642 RSANLTR 847 RLDGLRT 1052 RSDNLAR 1257 606.7
602 GAGTCGGAA 643 RSANLTR 848 RQDTLVG 1053 RSDNLAR 1258 616.7
603 GAGTCGGAA 644 QSGNLAR 849 RSDELRT 1054 RSDNLAR 1259 166.7
606 GGGGAGGATC 645 TTSNLRR 850 RSDNLQR 1055 RSDHLSR 1260 0.2
TABLE 3

SBS# TARGET SEQ ID F1 SEQ ID F2 SEQ ID F3 SEQ ID Kd (nM)


897 GAGGAGGTGA 1261 RSDALAR 1347 RSDNLAR 1433 RSDNLVR 1519 0.07
828 GCGGAGGACC 1262 EKANLTR 1348 RSDNLAR 1434 RSDERKR 1520 0.1
884 GAGGAGGTGA 1263 RSDSLTR 1349 RSDNLAR 1435 RSDNLVR 1521 0.15
817 GAGGAGGTGA 1264 RSDSLTR 1350 RSDNLAR 1436 RSDNLAR 1522 0.31
666 GCGGAGGCGC 1265 RSDDLTR 1351 RSDNLTR 1437 RSDTLKK 1523 0.5
829 GCGGAGGACC 1266 EKANLTR 1352 RSDNLAR 1438 RSDTLKK 1524 0.52
670 GACGTGGAGG 1267 RSDNLAR 1353 RSDALAR 1439 DRSNLTR 1525 0.57
801 AAGGAGTCGC 1268 RSADLRT 1354 RSDNLAR 1440 RSDNLTQ 1526 0.85
668 GTGGAGGCCA 1269 ERGTLAR 1355 RSDNLAR 1441 RSDALAR 1527 1.13
895 ATGGATTCAG 1270 QSHDLTK 1356 TSGNLVR 1442 RSDALTQ 1528 1.4
799 GGGGGAGCTG 1271 QSSDLQR 1357 QRAHLER 1443 RSDHLSR 1529 1.85
798 GGGGGAGCTG 1272 QSSDLQR 1358 QSGHLQR 1444 RSDHLSR 1530 3
842 GAGGTGGGCT 1273 DRSHLTR 1359 RSDALAR 1445 RSDNLAR 1531 5.4
894 TCAGTGGTAT 1274 QRSALAR 1360 RSDALSR 1446 QSHDLTK 1532 6.15
892 ATGGATTCAG 1275 QSHDLTK 1361 QQSNLVR 1447 RSDALTQ 1533 6.2
888 TCAGTGGTAT 1276 QSSSLVR 1362 RSDALSR 1448 QSHDLTK 1534 14
739 GCGGGCGGGC 1277 RSDHLTR 1363 ERGHLTR 1449 RSDDLRR 1535 16.5
850 CAGGCTGTGG 1278 RSDALTR 1364 QSSDLTR 1450 RSDNLRE 1536 17
797 GCAGAGGCTG 1279 QSSDLQR 1365 RSDNLAR 1451 QSGDLTR 1537 17.5
891 TCAGTGGTAT 1280 QSSSLVR 1366 RSDALSR 1452 QSGSLRT 1538 18.5
887 TCAGTGGTAT 1281 QRSALAR 1367
23.75
672 TCGGACGTGG 1282 RSDALAR 1368 DRSNLTR 1454 RSDELRT 1540 24
836 GGGGAGGCCC 1283 ERGTLAR 1369 RSDNLAR 1455 RSDHLSR 1541 24.25
674 GCGGCGTCGG 1284 RSDELRT 1370 RADTLRR 1456 RSDTLKK 1542 27.5
849 GGGGCCCTGG 1285 RSDALRE 1371 DRSSLTR 1457 RSDHLTQ 1543 29.05
825 GAATGGGCAG 1286 QSGSLTR 1372 RSDHLTT 1458 QSGNLTR 1544 37.3
673 GCGGGTGTCT 1287 DRSALAR 1373 QSSHLAR 1459 RSDTLKK 1545 48.33
848 GGGGAGGCCC 1288 DRSSLTR 1374 RSDNLAR 1460 RSDHLSR 1546 49.5
662 AGAGCGGCAC 1289 QTGSLTR 1375 RSDELQR 1461 QSGHLNQ 1547 50
667 GAGTCGGACG 1290 DRSNLTR 1376 RSDELRT 1462 RSDNLAR 1548 50
803 GCAGCGGCTC 1291 QSSDLQR 1377 RSDELQR 1463 QSGSLTR 1549 57.5
671 TCGGACGAGT 1292 RSDNLAR 1378 DRSNLTR 1464 RSDELRT 1550 64
851 GAGATGGATC 1293 QSSNLQR 1379 RRDVLMN 1465 RLHNLQR 1551 74
804 GCAGCGGCTC 1294 QSSDLQR 1380 RSDDLNR 1466 QSGSLTR 1552 82.5
669 GACGAGTCGG 1295 RSDELRT 1381 RSDNLAR 1467 DRSNLTR 1553 90
682 GCTGCAGGAG 1296 RSDHLAR 1382 QSGDLTR 1468 QSSDLSR 1554 90
845 GAGATGGATC 1297 QSSNLQR 1383 RSDALRQ 1469 RLHNLQR 1555 112.5
663 AGAGCGGCAC 1298 QTGSLTR 1384 RSDELQR 1470 KNWKLQA 1556 115
738 GCGGGGTCCG 1299 ERGTLTT 1385 RSDHLSR 1471 RSDDLRR 1557 120
664 AGAGCGGCAC 1300 QTGSLTR 1386 RADTLRR 1472 ASSRLAT 1558 125
833 GACTAGGACC 1301 EKANLTR 1387 RSDNLTK 1473 DRSNLTR 1559 136
685 GCTGCAGGAG 1302 RSDHLAR 1388 QSGSLTR 1474 QSSDLSR 1560 150
835 TAGGGAGCGT 1303 RADTLRR 1389 QSGHLTR 1475 RSDNLTT 1561 150
847 TAGGGAGCGT 1304 RSDDLTR 1390 QSGHLTR 1476 RSDNLTT 1562 150
818 GAATGGGCAG 1305 QSGSLTR 1391 RSDHLTT 1477 QSSNLVR 1563 167
834 GACTAGGACC 1306 EKANLTR 1392 RSDHLTT 1478 DRSNLTR 1564 186
837 GGGGCCCTGG 1307 RSDALRE 1393 DRSSLTR 1479 RSDHLSR 1565 222
764 GCAGAGGCTG 1308 TSGELVR 1394 RSDNLAR 1480 QSGDLTR 1566 255
774 GCAGCGGTAG 1309 QRSALAR 1395 RSDELQR 1481 QSGDLTR 1567 258
765 GCCGAGGCCG 1310 ERGTLAR 1396 RSDNLAR 1482 ERGTLAR 1568 262.5
766 GCCGAGGCCG 1311 ERGTLAR 1397 RSDNLAR 1483 DRSDLTR 1569 262.5
775 GCAGCGGTAG 1312 QSGALTR 1398 RSDELQR 1484 QSGDLTR 1570 265
763 GCAGAGGCTG 1313 TSGELVR 1399 RSDNLAR 1485 QSGSLTR 1571 275
838 GGGGCCCTGG 1314 RSDALRE 1400 DRSSLTR 1486 RSDHLTA 1572 300
841 GAGTGTGAGG 1315 RSDNLAR 1401 QSSHLAS 1487 RSDNLAR 1573 300
770 TTGGCAGCCT 1316 DRSSLTR 1402 QSGSLTR 1488 RSDSLTK 1574 325
767 GGGGGAGCTG 1317 QSSDLAR 1403 QSGHLQR 1489 RSDHLSR 1575 335
800 TTGGCAGCCT 1318 ERGTLAR 1404 QSGSLTR 1490 RSDSLTK 1576 400
832 GACTAGGACC 1319 EKANLTR 1405 RSDNLTT 1491 DRSNLTR 1577 408
844 GAGATGGATC 1320 QSSNLQR 1406 RSDALRQ 1492 RSDNLQR 1578 444
683 GCTGCAGGAG 1321 QSGHLAR 1407 QSGSLTR 1493 QSSDLSR 1579 500



805 GCAGCGGTAG 1322 QRSALAR 1408 RSDELQR 1494 QSGSLTR 1580 500
839 GAGTGTGAGG 1323 RSDNLAR 1409 TSDHLAS 1495 RSDNLAR 1581 625
840 GAGTGTGAGG 1324 RSDNLAR 1410 MSHHLKT 1496 RSDNLAR 1582 625
830 GGAGAGTCGG 1325 RSDELRT 1411 RSDNLAR 1497 QRAHLAR 1583 683
831 GGAGAGTCGG 1326 RSDDLTK 1412 RSDNLAR 1498 QRAHLAR 1584 700
684 GCTGCAGGAG 1327 RSAHLAR 1413 QSGSLTR 1499 QSSDLSR 1585 850
846 GAGATGGATC 1328 QSSNLQR 1414 RRDVLMN 1500 RSDNLQR 1586 889.5
819 AAGTAGGGTG 1329 QSSHLTR 1415 RSDNLTT 1501 RSDNLTQ 1587 1000
820 ACGGTAGTTA 1330 QSSALTR 1416 QRSALAR 1502 RSDTLTQ 1588 1000
821 ACGGTAGTTA 1331 NRATLAR 1417 QRSALAR 1503 RSDTLTQ 1589 1000
822 GTGTGCTGGT 1332 RSDHLTT 1418 ERQHLAT 1504 RSDALAR 1590 1000
823 GTGTGCTGGT 1333 RSDHLTK 1419 ERQHLAT 1505 RSDALAR 1591 1000
824 GTGTGCTGGT 1334 RSDHLTT 1420 DRSHLRT 1506 RSDALAR 1592 1000
885 GTGTGCTGGT 1335 RSDHLTK 1421 DRSHLRT 1507 RSDALAR 1593 1000
886 TCAGTGGTAT 1336 QSSSLVR 1422 RSDALSR 1508 QSGDLRT 1594 1000
889 ATGGATTCAG 1337 QSGSLTT 1423 QQSNLVR 1509 RSDALTQ 1595 1000
890 CTGGTATGTC 1338 QRSHLTT 1424 QRSALAR 1510 RSDALRE 1596 1000
896 AAGTAGGGTG 1339 TSGHLVR 1425 RSDNLTT 1511 RSDNLTQ 1597 1000
898 ACGGTAGTTA 1340 NRATLAR 1426 QSSSLVR 1512 RSDTLTQ 1598 1000
899 CTGGTATGTC 1341 QRSHLTT 1427 QSSSLVR 1513 RSDALRE 1599 1000
900 CTGGTATGTC 1342 MSHHLKE 1428 QSSSLVR 1514 RSDALRE 1600 1000
901 CTGGTATGTC 1343 MSHHLKE 1429 QRSALAR 1515 RSDALRE 1601 1000
773 GCAGCGGTAG 1344 QSGALTR 1430 RSDELQR 1516 QSGSLTR 1602 1250
768 GGGGGAGCTG 1345 QSSDLAR 1431 QRAHLER 1517 RSDHLSR 1603 2000
681 GCTGCAGGAG 1346 RSAHLAR 1432 QSGDLTR 1518 QSSDLSR 1604 3000



TABLE 4

SBS# TARGET SEQ ID F1 SEQ ID F2 SEQ ID F3 SEQ ID Kd (nM)


607 AAGGTGGCAG 1605 QSGDLTR 1707 RSDSLAR 1809 RLDNRTA 1911 6.5
608 TTGGCTGGGC 1606 GSWHLTR 1708 QSSDLQR 1810 RSDSLTK 1912 8
611 GTGGCTGCAG 1607 QSGDLTR 1709 QSSDLQR 1811 RSDALAR 1913 11.5
612 GTGGCTGCAG 1608 QSGTLTR 1710 QSSDLQR 1812 RSDALAR 1914 0.38
613 TTGGCTGGGC 1609 RSDHLAR 1711 QSSDLQR 1813 RGDALTS 1915 1.45
614 TTGGCTGGGC 1610 RSDHLAR 1712 QSSDLQR 1814 RSDSLTK 1916 2
616 GAGGAGGATG 1611 QSSNLQR 1713 RSDNLAR 1815 RSDNLQR 1917 0.08
617 AAGGGGGGG 1612 RSDHLSR 1714 RSDHLTR 1816 RKDNMTA 1918 1
618 AAGGGGGGG 1613 RSDHLSR 1715 RSDHLTR 1817 RKDNMTQ 1919 0.55
619 AAGGGGGGG 1614 RSDHLSR 1716 RSDHLTR 1818 RKDNMTN 1920 1.34
620 AAGGGGGGG 1615 RSDHLSR 1717 RSDHLTR 1819 RLDNRTA 1921 0.54
621 AAGGGGGGG 1616 RSDHLSR 1718 RSDHLTR 1820 RLDNRTQ 1922 0.75
624 ACGGATGTCT 1617 DRSALAR 1719 TSANLAR 1821 RSDTLRS 1923 7
628 TTGTAGGGGA 1618 RSDHLTR 1720 RSDNLTT 1822 RGDALTS 1924 130
629 TTGTAGGGGA 1619 RSSHLTR 1721 RSDNLTT 1823 RGDALTS 1925 150
630 CGGGGAGAGT 1620 RSDNLAR 1722 QSGHLQR 1824 RSDHLRE 1926 37.5
646 TTGGTGGAAG 1621 QSGNLAR 1723 RSDALAR 1825 RGDALTS 1927 35
647 TTGGTGGAAG 1622 QSANLAR 1724 RSDALAR 1826 RGDALTS 1928 40
651 GTTGTGGAAT 1623 QSGNLSR 1725 RSDALAR 1827 NRATLAR 1929 67.5
652 TAGGAGGCTG 1624 QSSDLQR 1726 RSDNLAR 1828 RSDNLTT 1930 1.5
653 ± ^Ovjri-sAjTO^ 1 _L O Z 3 1829 RSDNLTT 1931
654 TAGGCATAAA 1626 QSGNLRT 1728 QSGSLTR 1830 RSDNLTT 1932 105
655 TAGGCATAAA 1627 QSGNLRT 1729 QSSTLRR 1831 RSDNLTT 1933 1000
656 TAGGCATAAA 1628 QSGNLRT 1730 QSGSLTR 1832 RSDNLTS 1934 540
657 TAGGCATAAA 1629 QSGNLRT 1731 QSSTLRR 1833 RSDNLTS 1935 300
660 GAGGGAGTTC 1630 NRATLAR 1732 QSGHLTR 1834 RSDNLAR 1936 8.25
661 GAGGGAGTTC 1631 TTSALTR 1733 QSGHLTR 1835 RSDNLAR 1937 1.73
665 GCGGAGGCGC 1632 RSDDVTR 1734 RSDNLTR 1836 RSDDLRR 1938 12.5
689 AAGGCGGAGA 1633 RSDNLTR 1735 RSDELQR 1837 RLDNRTA 1939 82.5
692 AAGGCGGAGA 1634 RSDNLTR 1736 RSDELQR 1838 RSDNLTQ 1940 51
693 AAGGCGGAGA 1635 RSDNLTR 1737 RADTLRR 1839 RLDNRTA 1941 95
694 AAGGCGGAGA 1636 RSDNLTR 1738 RADTLRR 1840 RSDNLTQ 1942 28.5
695 GGGGGCGAGC 1637 RSSNLTR 1739 DRSHLAR 1841 RSDHLTR 1943 850
697 TGAGCGGCGG 1638 RSDELTR 1740 RSDELSR 1842 QSGHLTK 1944 200
698 TGAGCGGCGG 1639 RSDELTR 1741 RSDELSR 1843 QSHGLTS 1945 300
699 GCGGCGGCAG 1640 QSGSLTR 1742 RSDDLQR 1844 RSDERKR 1946 21.5
700 GCGGCGGCAG 1641 QSGDLTR 1743 RSDDLQR 1845 RSDERKR 1947 45
701 GCAGCGGAGC 1642 RSDNLAR 1744 RSDELQR 1846 QSGSLTR 1948 50.5
702 GCAGCGGAGC 1643 RSDNLAR 1745 RSDELQR 1847 QSGDLTR 1949 73.5
704 AAGGTGGCAG 1644 QSGDLTR 1746 RSDSLAR 1848 RSDNLTQ 1950 5
705 GGGGTGGGGC 1645 RSDHLAR 1747 RSDSLAR 1849 RSDHLSR 1951 0.01
706 GGGGTGGGGC 1646 RSDHLAR 1748 RSDSLLR 1850 RSDHLSR 1952 0.05
708 GAGTCGGAA 1647 QSANLAR 1749 RQDTLVG 1851 RSDNLAR 1953 300
709 GAGTCGGAA 1648 QSANLAR 1750 RKDVLVS 1852 RSDNLAR 1954 400
710 GAGTCGGAA 1649 QSGNLAR 1751 RLDGLRT 1853 RSDNLAR 1955 400
711 GAGTCGGAA 1650 QSGNLAR 1752 RQDTLVG 1854 RSDNLAR 1956 400
712 GGTGAGGAGT 1651 RSDNLAR 1753 RSDNLAR 1855 MSDHLSR 1957 9.5
713 GGTGAGGAGT 1652 RSDNLAR 1754 RSDNLAR 1856 MSHHLSR 1958 0.15
714 TGGGTCGCGG 1653 RSDELRR 1755 DRSALAR 1857 RSDHLTT 1959 200
715 TGGGTCGCGG 1654 RADTLRR 1756 DRSALAR 1858 RSDHLTT 1960 0.46
716 TTGGGAGCAC 1655 QSGSLTR 1757 QSGHLQR 1859 RGDALTS 1961 200
717 TTGGGAGCAC 1656 QSGSLTR 1758 QSGHLQR 1860 RSDALTK 1962 150
718 TTGGGAGCAC 1657 QSGSLTR 1759 QSGHLQR 1861 RSDALTR 1963 107.5
719 GGCATGGTGG 1658 RSDALTR 1760 RSDALTS 1862 DRSHLAR 1964 20
720 GAAGAGGATG 1659 TTSNLAR 1761 RSDNLAR 1863 QSGNLTR 1965 1.6
722 ATGGGGGTGG 1660 RSDALTR 1762 RSDHLTR 1864 RSDALRQ 1966 0.7
724 GGCATGGTGG 1661 RSDALTR 1763 RSDALRQ 1865 DRSHLAR 1967 2.5
bLJ. Ibilbl 111 IDD/S QSGHLQK 1866 QSSDLQR 1968 3000
726 GAAGAGGATG 1663 QSSNLAR 1765 RSDNLAR 1867 QSGNLTR 1969 1.5
727 GCGGTGGCTC 1664 QSSDLTR 1766 RSDALSR 1868 RSDTLKK 1970 0.1
728 GGTGAGGAGT 1665 RSDNLAR 1767 RSDNLAR 1869 DSSKLSR 1971 15
729 GGAGGGGAGT 1666 RSDNLAR 1768 RSDHLSR 1870 QSGHLAR 1972 1000
730 TGGGTCGCGG 1667 RSDDLTR 1769 DRSALAR 1871 RSDHLTT 1973 1000
731 GTGGGGGAGA 1668 RSDNLAR 1770 RSDHLSR 1872 RSDALAR 1974 12
732 GCGGGTGGGG 1669 RSDHLAR 1771 QSSHLAR 1873 RSDDLTR 1975 22.5
733 GCGGGTGGGG 1670 RSDHLAR 1772 QSSHLAR 1874 RSDTLKK 1976 0.32
734 GGGGCTGGGT 1671 RSDHLAR 1773 QSSDLSR 1875 RSDHLSR 1977 0.25
735 GCGGTGGCTC 1672 QSSDLTR 1774 RSDALSR 1876 RSDERKR 1978 0.05
736 GAGGTGGGGA 1673 RSDHLAR 1775 RSDALSR 1877 RSDNLSR 1979 0.47
737 GGAGGGGAGT 1674 RSDNLAR 1776 RSDHLSR 1878 QRGHLSR 1980 1000
740 AAGGTGGCAG 1675 QSGSLTR 1777 RSDALAR 1879 RSDNRTA 1981 12.5
741 AAGGCTGAGA 1676 RSDNLTR 1778 QSSDLQR 1880 RSDNLTQ 1982 15
742 ACGGGGTTAT 1677 QRGALAS 1779 RSDHLSR 1881 RSDTLKQ 1983 29
743 ACGGGGTTAT 1678 QRGALAS 1780 RSDHLSR 1882 RSDTLTQ 1984 10
744 ACGGGGTTAT 1679 QRSALAS 1781 RSDHLSR 1883 RSDTLKQ 1985 8.33
745 ACGGGGTTAT 1680 QRSALAS 1782 RSDHLSR 1884 RSDTLTQ 1986 12.5
746 CTGGAAGCAT 1681 QSGSLTR 1783 QSGNLAR 1885 RSDALRE 1987 2.07
747 CTATTTTGGG 1682 RSDHLTT 1784 QSSALRT 1886 QSGALRE 1988 2000
748 TTGGACGGCG 1683 DSGHLTR 1785 DRSNLER 1887 RGDALTS 1989 112.3
749 TTGGACGGCG 1684 DRSHLTR 1786 DSSNLTR 1888 RGDALTS 1990 11.33
750 GAGGGAGCGA 1685 RSDELTR 1787 QSAHLAR 1889 RSDNLAR 1991 52
751 GGTGAGGAGT 1686 RSDNLAR 1788 RSDNLAR 1890 NRSHLAR 1992 7
752 GAGGTGGGGA 1687 RSHHLAR 1789 RSDALSR 1891 RSDNLSR 1993 31
757 CGGGCGGCTG 1688 QSSDLRR 1790 RSDELQR 1892 RSDHLRE 1994 14.5
758 CGGGCGGCTG 1689 QSSDLRR 1791 RADTLRR 1893 RSDHLRE 1995 16.5
759 TTGGACGGCG 1690 DSGHLTR 1792 DSSNLTR 1894 RGDALTS 1996 37
760 TTGGACGGCG 1691 DRSHLTR 1793 DRSNLER 1895 RGDALTS 1997 148.5
761 GCGGTGGCTC 1692 QSSDLQR 1794 RSDALSR 1896 RSDERKR 1998 6
762 GCGGTGGCTC 1693 QSSDLQR 1795 RSDALSR 1897 RSDTLKK 1999 18
776 ATGGACGGGT 1694 RSDHLAR 1796 DRSNLER 1898 RSDSLNQ 2000 0.4
777 -ri X kjkji-i.^kjrkjrkj 1 _L O _7 D D CTlTJT AT? DRSNLTR 1899 RSDALSA 2001 3.4
779 CGGGGAGCAG 1696 QSGSLTR 1798 QSGHLTR 1900 RSDHLAE 2002 0.5
780 CGGGGAGCAG 1697 QSGSLTR 1799 QSGHLTR 1901 RSDHLRA 2003 0.5
781 GGGGAGCAGC 1698 RSSNLRE 1800 RSDNLAR 1902 RSDHLTR 2004 4.25
783 TTGGGAGCGG 1699 RSDELTR 1801 QSGHLQR 1903 RGDALTS 2005 2000
785 TTGGGAGCGG 1700 RSDTLKK 1802 QSGHLQR 1904 RSDALTS 2006 50
786 TTGGGAGCGG 1701 RSDTLKK 1803 QSGHLQR 1905 RGDALRS 2007 2000
787 AGGGAGGATG 1702 QSDNLAR 1804 RSDNLAR 1906 RSDHLTQ 2008 4
826 GAGGGAGCGA 1703 RSDELTR 1805 QSGHLAR 1907 RSDNLAR 2009 2.75
827 GAGGGAGCGA 1704 RADTLRR 1806 QSGHLAR 1908 RSDNLAR 2010 1.2
882 GCGTGGGCGT 1705 RSDELTR 1807 RSDHLTT 1909 RSDERKR 2011 0.01
883 GCGTGGGCGT 1706 RSDELTR 1808 RSDHLTT 1910 RSDERKR 2012 1
TABLE 5

SBS# TARGET SEQ ID F1 SEQ ID F2 SEQ ID F3 SEQ ID Kd (nM)


903 ATGGAAGGG 2013 RSDHLAR 2513 QSGNLAR 3013 RSDALRQ 3513 1.027
904 AAGGGTGAC 2014 DSSNLTR 2514 QSSHLAR 3014 RSDNLTQ 3514 1
905 GTGGTGGTG 2015 RSSALTR 2515 RSDSLAR 3015 RSDSLAR 3515 1.15
908 AAGGTCTCA 2016 QSGDLRT 2516 DRSALAR 3016 RSDNLRQ 3516 50
909 GTGGAAGAA 2017 QSGNLSR 2517 QSGNLQR 3017 RSDALAR 3517 16.4
910 ATGGAAGAT 2018 QSSNLAR 2518 QSGNLQR 3018 RSDALAQ 3518 0.03
911 ATGGGTGCA 2019 QSGSLTR 2519 QSSHLAR 3019 RSDALAQ 3519 0.91
912 TCAGAGGTG 2020 RSDSLAR 2520 RSDNLTR 3020 QSGDLRT 3520 0.135
914 CAGGAAAAG 2021 RSDNLTQ 2521 QSGNLAR 3021 RSDNLRE 3521 1.26
915 CAGGAAAAG 2022 RSDNLRQ 2522 QSGNLAR 3022 RSDNLRE 3522 45.15
916 GAGGAAGGA 2023 QSGHLAR 2523 QSGNLAR 3023 RSDNLQR 3523 1.3
919 TCATAGTAG 2024 RSDNLTT 2524 RSDNLRT 3024 QSGDLRT 3524 250
920 GATGTGGTA 2025 QSSSLVR 2525 RSDSLAR 3025 TSANLSR 3525 4
921 AAGGTCTCA 2026 QSGDLRT 2526 DPGALVR 3026 RSDNLRQ 3526 11
922 AAGGTCTCA 2027 QSHDLTK 2527 DRSALAR 3027 RSDNLRQ 3527 4
923 AAGGTCTCA 2028 QSHDLTK 2528 DPGALVR 3028 RSDNLRQ 3528 2
926 GTGGTGGTG 2029 RSDALTR 2529 RSDSLAR 3029 RSDSLAR 3529 7.502
927 CAGGTTGAG 2030 RSDNLAR 2530 TSGSLTR 3030 RSDNLRE 3530 3.61
928 CAGGTTGAG 2031 RSDNLAR 2531 QSSALTR 3031 RSDNLRE 3531 25
929 CAGGTAGAT 2032 QSSNLAR 2532 QSATLAR 3032 RSDNLRE 3532 1.3
931 GAGGAAGAG 2033 RSDNLAR 2533 QSSNLVR 3033 RSDNLAR 3533 2
932 ATGGAAGGG 2034 RSDHLAR 2534 QSSNLVR 3034 RSDALRQ 3534 797
933 GACGAGGAA 2035 QSANLAR 2535 RSDNLAR 3035 DRSNLTR 3535 500
934 ATGGAAGAT 2036 QSSNLAR 2536 QSGNLQR 3036 RSDALTS 3536 0.07
935 ATGGGTGCA 2037 QSGSLTR 2537 QSSHLAR 3037 RSDALTS 3537 0.91
937 GTGGGGGCT 2038 QSSDLTR 2538 RSDHLTR 3038 RSDSLAR 3538 0.03
938 GTGGGGGCT 2039 QSSDLRR 2539 RSDHLTR 3039 RSDSLAR 3539 0.049
939 GGGGGCTGG 2040 RSDHLTT 2540 DRSHLAR 3040 RSDHLSK 3540 0.352
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940 


GGGGGCTGG 


2041 


RSDHLTK 2 541 


DRSHLAR 3 041 


RSDHLSK 3 541 


1 . 5 


941 


GGGGCTGGG 


2042 


RSDHLAR 2 542 


QSSDLRR 3 042 


RSDKLSR 3 542 


0 . 


077 


942 


GGGGCTGGG 


2043 


RSDHLAR 2 54 3 


QSSDLRR 3 043 


RSDHLSK 3543 


0 


. 13 


943 


GGGGCTGGG 


2044 


RSDHLAR 2 544 


TSGELVR 3 044 


RSDKLSR 3 544 


0 . 


067 


944 


GGGGCTGGG 


2045 


RSDHLAR 2 54 5 


TSGELVR 3 04 5 


RSDHLSK 3 54 5 


0 . 


027 


945 


GGTGCGGTG 


2046 


RSDSLTR 2546 


RADTLRR 3 04 6 


MSHHLSR 3 54 6 


0 . 


027 


946 


GGTGCGGTG 


2047 


RSDSLTR 2547 


RSDVLQR 3 047 


MSHHLSR 3 54 7 


0 . 


027 


947 


GGTGCGGTG 


2048 


RSDSLTR 2548 


RSDELQR 3 04 8 


QSSHLAR 3 54 8 


0 . 


013 


948 


GGTGCGGTG 


2049 


RSDSLTR 2549 


RSDVLQR 3 04 9 


QSSHLAR 3 54 9 


0.017


962 


GAGGCGGCA 


2050 


QSGSLTR 2550 


RSDELQR 3 050 


RSDNLAR 3 55 0 


0.015


963 


GAGGCGGCA 


2051 


QSGSLTR 2551 


RSDDLQR 3 0 51 


RSDNLAR 3 551 


0.015


964 


GCGGCGGTG 


2052 


RSDALAR 2 552 


RSDELQR 3 0 52 


RSDERKR 3 5 52 


0.041


965 


GCGGCGGCC 


2053 


ERGDLTR 2553 


RSDELQR 3 0 53 


RSDERKR 3 5 53 


3.1


966 


GAGGAGGCC 


2054 


ERGTLAR 2 554 


RSDNLSR 3 054 


RSDNLAR 3 5 54 


0.028


967 


GAGGAGGCC 


2055 


DRSSLTR 2555 


RSDNLSR 3 055 


RSDNLAR 3 555 


0.055


968 


GAGGCCGCA 


2056 


QSGSLTR 2556 


DRSSLTR 3 0 56 


RSDNLAR 3 55 6 


1.4


969 


GAGGCCGCA 


2057 


QSGSLTR 2557 


DRSDLTR 3 0 57 


RSDNLAR 3 5 5 7 


0.275


970 


GTGGGCGCC 


2058 


ERGTLAR 2 5 5 8 


DRSHLAR 3 0 5 8 


RSDALAR 3 558 


1.859


971 


GTGGGCGCC 


2059 


DRSSLTR 2559 


DRSHLAR 3 0 5 9 


RSDALAR 3 55 9 


0.144


972 


GTGGGCGCC 


2060 


ERGDLTR 2 5 60 


DRSHLAR 3 0 60 


RSDALAR 3 56 0 


1.748


973 


GCCGCGGTC 


2061 


DRSALTR 2 5 61 


RSDELQR 3 0 61 


ERGTLAR 3 561 


0.6


974 


GCCGCGGTC 


2062 


DRSALTR 2 5 62 


RSDELQR 3 0 62 


DRSDLTR 3 562 


0.038


975 


CAGGCCGCT 


2063 


QSSDLTR 2 563 


DRSSLTR 3 063 


RSDNLRE 3 5 63 


1.1


976 


CAGGCCGCT 


2064 


QSSDLTR 2 564 


DRSDLTR 3 0 64 


RSDNLRE 3 5 64 


4.12


977 


CTGGCAGTG 


2065 


RSDSLTR 2 565 


QSGSLTR 3 0 65 


RSDALRE 3 5 65 


0.017


978 


CTGGCAGTG 


2066 


RSDSLTR 2 566 


QSGDLTR 3 0 66 


RSDALRE 3 566 


1.576


979 


CTGGCGGCG 


2067 


RSSDLTR 2 56 7 


RSDELQR 3 0 67 


RSDALRE 3 567 


1.59


980 


CTGGCGGCG 


2068 


RSDDLTR 2 568 


RSDELQR 3 0 68 


RSDALRE 3 5 68 


2.2
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RSDNLRE 3 569 


0.375


982 


CCGGGCTGG 


2070 


RSDHLTT 2 57 0 


DRSHLAR 3 0 7 0 


RSDELRE 3570 


0.03


983 


CCGGGCTGG 


2071 


RSDHLTK 2571 


DRSHLAR 3 071 


RSDELRE 3 571 


1.385


984 


GACGGCGAG 


2072 


RSDNLAR 2 5 72 


DRSHLAR 3 0 72 


DRSNLTR 3 5 72 


1.6


985 


GACGGCGAG 


2073 


RSDNLAR 2 573 


DRSHLAR 3 0 73 


EKANLTR 3 5 73 


0.965






986 


GGTGCTGAT 


2074 


QSSNLQR 2 574 


QSSDLQR 3 0 74 


MSHHLSR 3574 


1.6


987 


GGTGCTGAT 


2075 


QSSNLQR 2 575 


QSSDLQR 3 0 75 


TSGHLVR 3 5 75 


33.55


988 


GGTGCTGAT 


2076 


TSGNLVR 2 576 


QSSDLQR 3 0 76 


MSHHLSR 3 576 


0.15


989 


GGTGAGGGG 


2077 


RSDHLAR 2 577 


RSDNLAR 3 07 7 


MSHHLSR 3 5 77 


1.9


990 


AAGGTGGGC 


2078 


DRSHLTR 2 578 


RSDSLAR 3 07 8 


RSDNLTQ 3 5 78 


5.35


991 


AAGGTGGGC 


2079 


DRSHLTR 2 57 9 


SSGSLVR 3 07 9 


RSDNLTQ 3 579 


0.06


993 


GGGGCTGGG 


2080 


RSDHLAR 2580 


TSGELVR 3 08 0 


RSDHLSR 3 58 0 


3.1


994 


GGGGGCTGG 


2081 


RSDHLTK 2581 


DRSHLAR 3 0 81 


RSDHLSR 3 581 


0.03


995 


GGGGAGGAA 


2082 


QSANLAR 2 582 


RSDNLAR 3 0 82 


RSDHLSK 3 5 82 


0.08


996 


CAGTTGGTC 


2083 


DRSALAR 2 583 


RSDALTS 3 0 83 


RSDNLRE 3 583 


9.6


997 


AGAGAGGCT 


2084 


QSSDLTR 2584 


RSDNLAR 3 0 84 


QSGHLNQ 3 5 84 


1.65


998 


ACGTAGTAG 


2085 


RSANLRT 2 5 85 


RSDNLTK 3 0 8 5 


RSDTLKQ 3 58 5 


0.23


999 


AGAGAGGCT 


2086 


QSSDLTR 2586 


RSDNLAR 3 08 6 


QSGKLTQ 3 58 6 


0.6


1000 


CAGTTGGTC 


2087 


DRSALAR 2 5 8 7 


RSDALTR 3 087 


RSDNLRE 3 58 7 


11.15


1001 


GGAGCTGAC 


2088 


EKANLTR 2 588 


QSSDLSR 3088 


QRAHLAR 3 58 8 


1.8


1002 


GCGGAGGAG 


2089 


RSDNLVR 2 589 


RSDNLAR 3 0 8 9 


RSDERKR 3 58 9 


0.028


1003 


ACGTAGTAG 


2090 


RSANLRT 2 590 


RSDNLTK 3 0 9 0 


RSDTLRS 3 590 


0.118


1004 


ACGTAGTAG 


2091 


RSDNLTT 2 591 


RSDNLTK 3 0 91 


RSDTLRS 3591 


1.4


1006 


GTAGGGGCG 


2092 


RSDDLTR 2 592 


RSDHLTR 3 0 92 


QRASLTR 3 5 92 


0.898


1007 


GAGAGAGAT 


2093 


QSSNLQR 2 593 


QSGHLTR 3 0 93 


RLHNLAR 3 5 93 


167 


1008 


GAGATGGAG 


2094


RSDNLSR 2 5 94 


RSDSLTQ 3 0 94 


RLHNLAR 3 5 94 


0.4


1009 


GAGATGGAG 


2095 


RSDNLSR 2 595 


RSDSLTQ 3 0 95 


RSDNLSR 3 5 95 


1.9


1010 


GAGAGAGAT 


2096 


QSSNLQR 2 5 96 


QSGHLTR 3 0 96 


RSDNLAR 3 596 


8.2


1011 


TTGGTGGCG 


2097 


RSADLTR 2 5 97 


RSDSLAR 3 0 97 


RSDSLTK 3 5 97 


0.03


1012 


GACGTAGGG 


2098 


RSDHLTR 2 598 


QSSSLVR 3 0 98 


DRSNLTR 3 5 98 


0.032


1013 


GAGAGAGAT 


2099 


QSSNLQR 2 599 


QSGHLNQ 3 0 99 


RSDNLAR 3 5 99 


0.15


1014 


GACGTAGGG 


2100 


RSDHLTR 2 60 0 


QSGSLTR 310 0 


DRSNLTR 3 60 0 


0.01


1015 


GCGGAGGAG 


2101 


RSDNLVR 2 6 01 


RSDNLAR 3101 


RSDTLKK 3 601 


0.008


1016 


CAGTTGGTC 


2102 


DRSALAR 2 602 


RSDSLTK 3102 


RSDNLRE 3 6 02 


0.09


1017 


CTGGATGAC 


2103 


EKANLTR 2603 


TSGNLVR 3103 


RSDALRE 3 6 03 


0.233


1018 


GTAGTAGAA 


2104 


QSANLAR 2 604 


QSSSLVR 3104 


QRASLAR 3 6 04 


7.2


1019 


AGGGAGGAG 


2105 


RSDNLAR 2 6 05 


RSDNLAR 310 5 


RSDHLTQ 3 605 


0.022


1020 


ACGTAGTAG 


2106 


RSDNLTT 2 606 


RSDNLTK 3106 


RSDTLKQ 3 60 6 


0.69
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1022 


GAGGAGGTG 


2107 


RSDALAR 2 607 


RSDNLAR 310 7 


RSDNLAR 3 6 07 


0.01


1024 


GGGGAGGAA 


2108 


QSANLAR 2 60 8 


RSDNLAR 310 8 


RSDHLSR 3 60 8 


0.08


1025 


GAGGAGGTG 


2109 


QSSALTR 2 60 9 


QSSSLVR 3109 


RSDTLTQ 3 60 9 


0.115


1026 


GTGGCTTGT 


2110 


MSHHLKE 2 610 


QSSDLSR 3110 


RSDALAR 3 610 


0.076


1027 


GCGGCGGTG 


2111 


RSDALAR 2 611 


RSDELQR 3111 


RSDELQR 3 611 


0.054


1032 


GGTGCTGAT 


2112 


TSGNLVR 2 612 


QSSDLQR 3112 


TSGHLVR 3 612 


0.52


1033 


GTGTTCGTG 


2113 


RSDALAR 2 613 


DRSALTT 3113 


RSDALAR 3 613 


6£ 


55.2 


1034 


GTGTTCGTG 


2114 


RSDALAR 2 614 


DRSALTK 3114 


RSDALAR 3 614 


14 . 55 


1035 


GTGTTCGTG 


2115 


RSDALAR 2 615 


DRSALRT 3115 


RSDALAR 3 615 




56 


1037 


GTAGGGGCA 


2116 


QSGSLTR 2 616 


RSDHLSR 3116 


QRASLAR 3 616 


0.05


1038 


GTAGGGGCA 


2117 


QTGELRR 2 617 


RSDHLSR 3117 


QRASLAR 3 617 


0.152


1039 


GGGGCTGGG 


2118 


RSDHLSR 2 618 


TSGELVR 3118 


RSDHLTR 3 618 


1.37


1040 


GGGGCTGGG 


2119 


RSDHLSR 2 619 


QSSDLQR 3119 


RSDHLSK 3619 


0.05


1041 


TCATAGTAG 


2120 


RSDNLTT 2 62 0 


RSDNLRT 312 0 


QSHDLTK 3 62 0 


2.06


1043 


CAGGGAGAG 


2121 


RSDNLAR 2 621 


QSGHLTR 3121 


RSDNLRE 3 621 


0.16


1044 


CAGGGAGAG 


2122 


RSDNLAR 2 622 


QRAHLER 312 2 


RSDNLRE 3 622 


1.07


1045 


GGGGCAGGA 


2123 


QSGHLAR 2 62 3 


QSGSLTR 312 3 


RSDHLSR 3623 


0.15


1046 


GGGGCAGGA 


2124 


QSGHLAR 2 624 


QSGDLRR 3124 


RSDHLSR 3624 


0.09


1047 


GGGGCAGGA 


2125 


QRAHLER 2 62 5 


QSGSLTR 3125 


RSDHLSR 3625 


24 . 7 


1048 


CAGGCTGTA 


2126 


QSGALTR 2 62 6 


QSSDLQR 3126 


RSDNLRE 3 626 


1.387


1049 


CAGGCTGTA 


2127 


QRASLAR 2 62 7 


QSSDLQR 3127 


RSDNLRE 3627 


55 . 6 


1050 


CAGGCTGTA 


2128 


QSSSLVR 2 62 8 


QSSDLQR 312 8 


RSDNLRE 3 62 8 


0.125


1051 


GAGGCTGAG 


2129 


RSDNLTR 2 62 9 


QSSDLQR 312 9 


RSDNLVR 3 62 9 


0.02


1052 


TAGGACGGG 


2130 


RSDHLAR 2 63 0 


EKANLTR 313 0 


RSDNLTT 3 63 0 


0.28


1053 


TAGGACGGG 


2131 


RSDHLAR 2 631 


DRSNLTR 3131 


RSDNLTT 3 631 


0.025


1054 


GCTGCAGGG 


2132 


RSDHLAR 2 63 2 


QSGSLTR 3132 


QSSDLQR 3632 


0.033


1055 


GCTGCAGGG 


2133 


RSDHLAR 2 63 3 


QSGSLTR 3133 


TSGDLTR 3 63 3 


18.73


1056 


GCTGCAGGG 


2134 


RSDHLAR 2 634 


QSGSLTR 3134 


QSSDLQR 3634 


0.045


1057 


GCTGCAGGG 


2135 


RSDHLAR 2 63 5 


QSGDLTR 313 5 


TSGDLTR 3 63 5 


0.483


1058 


GGGGCCGCG 


2136 


RSDELTR 2 63 6 


DRSSLTR 3136 


RSDHLSR 3 63 6 


6.277


1059 


GGGGCCGCG 


2137 


RSDELTR 2 63 7 


DRSDLTR 313 7 


RSDHLSR 3 63 7 


0.152


1060 


GCGGAGGCC 


2138 


ERGTLAR 2 63 8 


RSDNLAR 313 8 


RSDERKR 3 63 8 


0.69


1061 


GTTGCGGGG 


2139 


RSDHLAR 2 63 9 


RSDELQR 313 9 


QSSALTR 3 63 9 


0.165






1062 


GTTGCGGGG 


2140 


RSDHLAR 2 64 0 


RSDELQR 314 0 


TSGSLTR 3640 


0.068


1063 


GTTGCGGGG 


2141 


RSDHLAR 2 641 


RSDELQR 3141 


MSHALSR 3 641 


0.96


1064 


GCGGCAGTG 


2142 


RSDALTR 2 642 


QSGSLTR 3142 


RSDERKR 3 642 


0.453


1065 


TGGGGCGGG 


2143 


RSDHLAR 2 64 3 


DRSHLAR 314 3 


RSDHLTT 3 643 


1.37


1066 


GAGGGCGGT 


2144 


QSSHLTR 2 644 


DRSHLAR 3144 


RSDNLVR 3 644 


0.15


1067 


GAGGGCGGT 


2145 


TSGHLVR 2 64 5 


DRSHLAR 314 5 


RSDNLVR 3 645 


1.37


1068 


GCAGGGGGC 


2146 


DRSHLTR 2 64 6 


RSDHLTR 314 6 


QSGDLTR 3 64 6 


2.05


1069 


GCAGGCGGT 


2147 


DRSHLTR 2 64 7 


RSDHLTR 314 7 


QSGSLTR 3647 


0.1


1070 


GGGGCAGGC 


2148 


DRSHLTR 2 64 8 


QSGSLTR 3148 


RSDHLSR 3648 


0.456


1071 


GGGGCAGGC 


2149 


DRSHLTR 2 64 9 


QSGDLTR 314 9 


RSDHLSR 3649 


0.2


1072 


GGATTGGCT 


2150 


QSSDLTR 2 65 0 


RSDALTT 3150 


QRAHLAR 3 65 0 


0.46


1073 


GGATTGGCT 


2151 


QSSDLTR 2 651 


RSDALTK 3151 


QRAHLAR 3651 


1.37


1075 


GTGTTGGCG 


2152 


RSDELTR 2 652 


RSDALTK 3152 


RSDALTR 3 652 


0.915


1076 


GCGGCAGCG 


2153 


RSDELTR 2 653 


QSGSLTR 3153 


RSDERKR 3 653 


4.1


1077 


GCGGCAGCG 


2154 


RSDELTR 2 654 


QSGDLRR 3154 


RSDERKR 3654 


6.2


1078 


GGGGGGGCC 


2155 


ERGTLAR 2 655 


RSDHLSR 3155 


RSDHLSR 3655 


0.2


1079 


GGGGGGGCC 


2156 


ERGDLTR 2 65 6 


RSDHLSR 3156 


RSDHLSR 3656 


4.1


1080 


CTGGAGGCG 


2157 


RSDELTR 2 65 7 


RSDNLAR 3157 


RSDALRE 3 657 


1.37


1081 


GGGGAGGTG 


2158 


RSDALTR 2 65 8 


RSDNLTR 3158 


RSDHLSR 3 658 


0.05


1082 


CTGGCGGCG 


2159 


RSDELTR 2 65 9 


RSDELTR 3159 


RSDALRE 3 65 9 


0.152


1083 


CTGGTGGCA 


2160 


QSGDLTR 2 66 0 


RSDALSR 3160 


RSDALRE 3 660 


0.152


1084 


GGTGAGGCG 


2161 


RSDELTR 2 661 


RSDNLAR 3161 


MSHHLSR 3 661 


0.5


1085 


GGTGAGGCG 


2162 


RSDELTR 2 662 


RSDNLAR 3162 


QSSHLAR 3 6 62 


0.46


1086 


GGGGCTGGG 


2163 


RSDHLSR 2 663 


QSSDLQR 3163 


RSDHLTR 3 663 


0.1


1087 


CGGGCGGCC 


2164 


ERGDLTR 2 6 64 


RSDELQR 3164 


RSDHLAE 3 664 


1.24


1088 


CGGGCGGCC 


2165 


ERGDLTR 2 6 65 


RSDELQR 3165 


RSDHLRE 3 665 


0.905


1089 


GACGAGGCT 


2166 


QSSDLRR 2 66 6 


RSDNLAR 3166 


DRSNLTR 3 666 


0.171


1090 


AAGGCGCTG 


2167 


RSDALRE 2 667 


RSDELQR 3167 


RSDNLTQ 3 667 


30.3 




GTAGAGGAC 


2168 


DRSNLTR 2 6 68 


RSDNLAR 3168 


QRASLAR 3668 


0.085


1092 


GCCTTGGCT 


2169 


QSSDLRR 2 669 


RGDALTS 316 9 


DRSDLTR 3 66 9 


2.735


1093 


GCGGAGTCG 


2170 


RSADLRT 2 6 7 0 


RSDNLAR 317 0 


RSDERKR 3 67 0 


0.046


1094 


GCGGTTGGT 


2171 


TSGHLVR 2 671 


QSSALTR 3171 


RSDERKR 3 671 


12.34


1095 


GGGGGAGCC 


2172 


ERGDLTR 2 6 72 


QRAHLER 3172 


RSDHLSR 3 6 72 


0.395
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1096 


GGGGGAGCC 


2173 


DRSSLTR 2 6 73 


QRAHLER 3173 


RSDHLSR 3 673 


0 .019 


1097 


GAGGCCGAA 


2174 


QSANLAR 2 674 


DCRDLAR 3174 


RSDNLAR 3 674 


0 . 77 


1098 


GCCGGGGAG 


2175 


RSDNLTR 2 675 


RSDHLTR 3175 


DRSDLTR 3 675 


0 . 055 


1099 


GCGGAGTCG 


2176 


TSGHLVR 2 67 6 


TSGSLTR 3176 


RSDERKR 3 67 6 


0 . 45 


1100 


GTGTTGGTA 


2177 


QSGALTR 2 67 7 


RGDALTS 3177 


RSDALTR 3 677 


1.4 


1101 


ATGGGAGTT 


2178 


TTSALTR 2 67 8 


QRAHLER 3178 


RSDALRQ 3 678 


0 . 065 


1102 


AAGGCAGAA 


2179 


QSANLAR 2 67 9 


QSGSLTR 3179 


RSDNLTQ 3 67 9 


8 . 15 


1103 


AAGGCAGAA 


2180 


QSANLAR 2 6 8 0 


QSGDLTR 318 0 


RSDNLTQ 3 68 0 


1 .4 


1104 


CGGGCAGCT 


2181 


QSSDLRR 2 681 


QSGSLTR 3181 


RSDHLRE 3 6 81 


0 . 08 


1105 


CTGGCAGCC 


2182 


ERGDLTR 2 682 


QSGDLTR 3182 


RSDALRE 3 682 


2 .45 


1106 


CTGGCAGCC 


2183 


DRSSLTR 2 683 


QSGDLTR 3183 


RSDALRE 3 683 


0 . 19 


1107 


GCGGGAGTT 


2184 


QSSALAR 2 684 


QRAHLER 3184 


RSDERKR 3 684 


0 . 06 


1108 


CAGGCTGGA 


2185 


QSGHLAR 2 685 


TSGELVR 3185 


RSDNLRE 3 685 


0 . 007 


1109 


AGGGGAGCC 


2186 


ERGDLTR 2 6 8 6 


QRAHLER 3186 


RSDHLTQ 3 686 


0 .347 


1110 


AGGGGAGCC 


2187 


DRSSLTR 2 687 


QRAHLER 3187 


RSDHLTQ 3 687 


0 . 095 


1111 


CTGGTAGGG 


2188 


RSDHLAR 2 68 8 


QSSSLVR 3188 


RSDALRE 3 6 88 


0 . 095 


1112 


CTGGTAGGG 


2189 


RSDHLAR 2 68 9 


QSATLAR 318 9 


RSDALRE 3 689 


0 . 125 


1113 


CTGGGGGCA 


2190 


QSGDLTR 2 690 


RSDHLTR 3190 


RSDALRE 3 690 


0 . 06 


1114 


CAGGTTGAT 


2191 


QSSNLAR 2 6 91 


TSGSLTR 3191 


RSDNLRE 3 691 


2 . 75 


1115 


CAGGTTGAT 


2192 


QSSNLAR 2 6 92 


QSSALTR 3192 


RSDNLRE 3 692 


0 . 7 


1116 


CCGGAAGCG 


2193 


RSDELTR 2 6 93 


QSSNLVR 3193 


RSDELRE 3 6 93 


12 . 3 


1117 


GCAGCGCAG 


2194 


RSSNLRE 2 6 94 


RSDELTR 3194 


QSGSLTR 3694 


2 . 85 


1118 


TAGGGAGTC 


2195 


DRSALTR 2 695 


QRAHLER 3195 


RSDNLTT 3 6 95 


1.4 


1119 


TGGGAGGGT 


2196 


TSGHLVR 2 696 


RSDNLAR 3196 


RSDHLTT 3 696 


0 . 1 


1120 


AGGGACGCG 


2197 


RSDELTR 2 6 97 


DRSNLTR 3197 


RSDHLTQ 3 697 


2 . 735 


1121 


CTGGTGGCC 


2198 


ERGDLTR 2 6 98 


RSDALTR 3198 


RSDALRE 3 698 


2 . 76 


1122 


CTGGTGGCC 


2199 


DRSSLTR 2 6 99 


RSDALTR 3199 


RSDALRE 3 6 99 


0 . 101 


1123 


TAGGAAGCA 


2200 


QSGSLTR 270 0 


QSGNLAR 32 0 0 


RSDNLTT 3 7 00 


0 .065 


1124 


GTGGATGGA 


2201 


QSGHLAR 2 7 01 


TSGNLVR 32 01 


RSDALTR 3701 


0 . 101 


1126 


TTGGCTATG 


2202 


RSDALTS 2 7 02 


TSGELVR 32 02 


RGDALTS 3 7 02 


0.46 


1127 


CAGGGGGTT 


2203 


QSSALAR 2 703 


RSDHLTR 32 03 


RSDNLRE 3 7 03 


0 . 1 


1128 


AAGGTCGCC 


2204 


ERGDLTR 2 704 


DPGALVR 3 2 04 


RSDNLTQ 3 704 


5 .45 


1130 


GGTGCAGAC 


2205 


DRSNLTR 2 7 05 


QSGDLTR 3 2 05 


MSHHLSR 3 705 


0 . 1 






1131 GTGGGAGCC 2206 ERGDLTR 2706 QRAHLER 3206 RSDALTR 3706 0.95
1132 GGGGCTGGA 2207 QSGHLAR 2707 TSGELVR 3207 RSDHLSR 3707 0.055
1133 GGGGCTGGA 2208 QRAHLER 2708 TSGELVR 3208 RSDHLSR 3708 0.5
1134 TGGGGGTGG 2209 RSDHLTT 2709 RSDHLTR 3209 RSDHLTT 3709 0.067
1135 GCGGCGGGG 2210 RSDHLAR 2710 RSDELQR 3210 RSDERKR 3710 0.025
1136 CCGGGAGTG 2211 RSDALTR 2711 QRAHLER 3211 RSDTLRE 3711 0.225
1137 CCGGGAGTG 2212 RSSALTR 2712 QRAHLER 3212 RSDTLRE 3712 0.085
1138 CAGGGGGTA 2213 QSGALTR 2713 RSDHLTR 3213 RSDNLRE 3713 0.027
1139 ACGGCCGAG 2214 RSDNLAR 2714 DRSDLTR 3214 RSDTLTQ 3714 0.535
1140 AAGGGTGCG 2215 RSDELTR 2715 QSSHLAR 3215 RSDNLTQ 3715 0.3
1141 ATGGACTTG 2216 RGDALTS 2716 DRSNLTR 3216 RSDALTQ 3716 1.7
1148 TTGGAGGAG 2217 RSDNLTR 2717 RSDNLTR 3217 RGDALTS 3717 0.006
1149 TTGGAGGAG 2218 RSDNLTR 2718 RSDNLTR 3218 RSDALTK 3718 0.004
1150 GAAGAGGCA 2219 QSGSLTR 2719 RSDNLTR 3219 QSGNLTR 3719 0.004
1151 GTAGTATGG 2220 RSDHLTT 2720 QRSALAR 3220 QRASLAR 3720 1.63
1152 AAGGCTGGA 2221 QSGHLAR 2721 TSGELVR 3221 RSDNLTQ 3721 1.605
1153 AAGGCTGGA 2222 QRAHLAR 2722 TSGELVR 3222 RSDNLTQ 3722 8.2
1154 CTGGCGTAG 2223 RSDNLTT 2723 RSDELQR 3223 RSDALRE 3723 1.04
1156 ATGGTTGAA 2224 QSAMLAR 2724 QSSALTR 3224 RSDALRQ 3724 7.2
1157 ATGGTTGAA 2225 QSANLAR 2725 TSGSLTR 3225 RSDALRQ 3725 0.885
1158 AGGGGAGAA 2226 QSANLAR 2726 QSGHLTR 3226 RSDHLTQ 3726 0.1
1159 AGGGGAGAA 2227 QSANLAR 2727 QRAHLER 3227 RSDHLTQ 3727 0.555
1160 TGGGAAGGC 2228 DRSHLAR 2728 QSSNLVR 3228 RSDHLTT 3728 0.415
1161 GAGGCCGGC 2229 DRSHLAR 2729 DRSDLTR 3229 RSDNLAR 3729 0.45
1162 GTGTTGGTA 2230 QSGALTR 2730 RADALMV 3230 RSDALTR 3730 0.465
1163 GTGTGAGCC 2231 ERGDLTR 2731 QSGHLTT 3231 RSDALTR 3731 1.45
1164 GTGTGAGCC 2232 ERGDLTR 2732 QSVHLQS 3232 RSDALTR 3732 15.4
1165 GCGAAGGTG 2233 RSDALTR 2733 RSDNLTQ 3233 RSDERKR 3733 1.4
- - - RSDALTR 2734 - - 0.195
1167 GCGAAGGTG 2235 RSDALTR 2735 RSDNLTQ 3235 RSHDRKR 3735 0.95
1168 AAGGCGCTG 2236 RSDALRE 2736 RSSDLTR 3236 RSDNLTQ 3736 2.8
1169 GTAGAGGAC 2237 DRSNLTR 2737 RSDNLAR 3237 QSSSLVR 3737 0.053
1170 GCCTTGGCT 2238 QSSDLRR 2738 RADALMV 3238 DRSDLTR 3738 2.75
1171 GCGGAGTCG 2239 RSDDLRT 2739 RSDNLAR 3239 RSDERKR 3739 0.18
1172 GCCGGGGAG 2240 RSDNLTR 2740 RSDHLTR 3240 ERGDLTR 3740 0.01
1173 GCTGAAGGG 2241 RSDHLSR 2741 QSGNLAR 3241 QSSDLRR 3741 0.008
1174 GCTGAAGGG 2242 RSDHLSR 2742 QSSNLVR 3242 QSSDLRR 3742 0.018
1175 AAGGTCGCC 2243 DRSDLTR 2743 DPGALVR 3243 RSDNLTQ 3743 8.9
1176 GTGGGAGCC 2244 DRSDLTR 2744 QRAHLER 3244 RSDALTR 3744 4.1
1177 CCGGGCGCA 2245 QSGSLTR 2745 DRSHLAR 3245 RSDTLRE 3745 4.1
1178 GAGGATGGC 2246 DRSHLAR 2746 TSGNLVR 3246 RSDNLAR 3746 0.085
1179 GCAGCGCAG 2247 RSSNLRE 2747 RSSDLTR 3247 QSGSLTR 3747 2.735
1180 AAGGAAAGA 2248 QSGHLNQ 2748 QSGNLAR 3248 RSDNLTQ 3748 4.825
1181 TTGGCTATG 2249 RSDALRQ 2749 TSGELVR 3249 RGDALTS 3749 8.2
1182 CAGGAAGGC 2250 DRSHLAR 2750 QSGNLAR 3250 RSDNLRE 3750 1.48
1183 CAGGAAGGC 2251 DRSHLAR 2751 QSSNLVR 3251 RSDNLRE 3751 1.935
1184 AAGGAAAGA 2252 KNWKLQA 2752 QSGNLAR 3252 RSDNLTQ 3752 2.785
1185 AAGGAAAGA 2253 KNWKLQA 2753 QSHNLAR 3253 RSDNLTQ 3753 5.25
1186 GCCGAGGTG 2254 RSDSLLR 2754 RSKNLQR 3254 ERGTLAR 3754 27.5
1187 CTGGTGGGC 2255 DRSHLAR 2755 RSDALTR 3255 RSDALRE 3755 0.006
1188 GTAGTATGG 2256 RSDHLTT 2756 QSSSLVR 3256 QRASLAR 3756 2.74
1189 ATGGTTGAA 2257 QSANLAR 2757 TSGALTR 3257 RSDALRQ 3757 1.51
1190 ATGGCAGTG 2258 RSDALTR 2758 QSGDLTR 3258 RSDSLNQ 3758 1.484
1191 ATGGCAGTG 2259 RSDALTR 2759 QSGSLTR 3259 RSDSLNQ 3759 5.325
1192 ATGGCAGTG 2260 RSDALTR 2760 QSGDLTR 3260 RSDALTQ 3760 2.364
1193 ATGGCAGTG 2261 RSDALTR 2761 QSGSLTR 3261 RSDALTQ 3761 3.125
1194 GAGAAGGTG 2262 RSDALTR 2762 RSDNRTA 3262 RSDNLTR 3762 2.19
1195 GAGAAGGTG 2263 RSDALTR 2763 RSDNRTA 3263 RSSNLTR 3763 [illegible: "! . 8"]
1197 GAAGGTGCC 2264 ERGDLTR 2764 MSHHLSR 3264 QSGNLTR 3764 14.8
1199 ATGGAGAAG 2265 RSDNRTA 2765 RSDNLTR 3265 RSDALTQ 3765 3.428
1200 ATGGAGAAG 2266 RSDNRTA 2766 RSSNLTR 3266 RSDALTQ 3766 16.87
1201 ATGGAGAAG 2267 RSDNRTA 2767 RSHNLTR 3267 RSDALTQ 3767 14.8
1202 CTGGAGTAC 2268 DRSNLRT 2768 RSDNLTR 3268 RSDALRE 3768 2.834
1203 GGAGTACTG 2269 RSDALRE 2769 QRSALAR 3269 QRAHLAR 3769 2.945
1204 GGAGTACTG 2270 RSDALRE 2770 QSSSLVR 3270 QRAHLAR 3770 4.38
1205 CGGGCAGCT 2271 QSSDLRR 2771 QSGDLTR 3271 RSDHLRE 3771 [illegible: "( ) . 9"]
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1206 


GCGGGAGTT 


2272 


TTSALTR 2 772 


QRAHLER 3 2 72 


RSDERKR 3 7 72 


0.034


1207 


CAGGCTGGA 


2273 


QRAHLER 2 7 73 


TSGELVR 32 73 


RSDNLRE 3 7 7 3 


0.45


1209 


CCGGAAGCG 


2274 


RSDELTR 2 774 


QSSNLVR 32 74 


RSDTLRE 3 7 74 


IS 


> .28 


1211 


GCAGCGCAG 


2275 


RSDNLRE 2 775 


RSDELTR 32 75 


QSGSLTR 3 775 


6 . 5 


1212 


CAGGGGGTT 


2276 


TTSALTR 2 776 


RSDHLTR 32 7 6 


RSDNLRE 3 776 


0.05


1213 


GAAGAAGAG 


2277 


RSDNLTR 2 77 7 


QSSNLVR 32 7 7 


QSGNLTR 3 777 


12 .3 


1214 


ATGGGAGTT 


2278 


TTSALTR 2 778 


QRAHLER 32 7 8 


RSDALTQ 3 778 


0.46


1215 


GTGGGGGCT 


2279 


QSSDLRR 2 779 


RSDHLTR 32 7 9 


RSDALTR 3 779 


0.003


1217 


GAAGAGGCA 


2280 


QSGSLTR 2 78 0 


RSDNLTR 3 2 8 0 


QSANLTR 3780 


0.004


1218 


GCGGTGAGG 


2281 


RSDHLTQ 2 781 


RSQALTR 32 81 


RSDERKR 3 781 


0.46


1219 


AAGGAAAGG 


2282 


RSDHLTQ 2 782 


QSHNLAR 3 2 82 


RSDNLTQ 3 782 


0.68


1220 


AAGGAAAGG 


2283 


RSDHLTQ 2 783 


QSGNLAR 3 2 83 


RSDNLTQ 3 7 83 


0.175


1221 


AAGGAAAGG 


2284 


RSDHLTQ 2 784 


QSSNLVR 3 2 84 


RSDNLTQ 3 784 


1.4 


1222 


CAGGAGGGC 


2285 


DRSHLAR 2 785 


RSDNLAR 32 8 5 


RSDNLRE 3 7 8 5 


0.155


1223 


ATGGACTTG 


2286 


RSDALTK 2786 


DRSNLTR 3 2 86 


RSDALTQ 3 7 8 6 




7 


1224 


ATGGACTTG 


2287 


RADALMV 2 7 87 


DRSNLTR 3 2 8 7 


RSDALTQ 3 7 8 7 




12 


1227 


GAATAGGGG 


2288


RSDHLSR 2788 


RSDHLTK 328 8 


QSGNLAR 3 7 8 8 




25 


1228


ACGGCCGAG 


2289 


RSDNLAR 2 78 9 


DRSDLTR 328 9 


RSDDLTQ 3 78 9 




12 


1229 


AAGGGTGCG 


2290 


RSDELTR 2 790 


MSHHLSR 32 90 


RSDNLTQ 3 7 90 


E 


5 . 2 


1230 


AAGGGAGAC 


2291 


DRSNLTR 2 791 


QSGHLTR 32 91 


RSDNLTQ 3 7 91 


0.383


1231 


AAGGGAGAC 


2292 


DRSNLTR 2 7 92 


QRAHLER 3 2 92 


RSDNLTQ 3 7 92 


0.213


1232 


TGGGACCTG 


2293 


RSDALRE 2 793 


DRSNLTR 32 93 


RSDHLTT 3 7 93 


0.113


1233 


TGGGACCTG 


2294 


RSDALRE 2 7 94 


DRSNLTR 32 94 


RSDHLTT 3 7 94 


0.635


1234 


GAGTAGGCA 


2295 


QSGSLTR 2795 


RSDNLTK 32 95 


RSDNLAR 3 7 95 


0.101


1236 


GAGTAGGCA 


2296 


QSGSLTR 2796 


RSDHLTT 32 96 


RSDNLAR 3 7 96 


0.065


1237 


GAAGGAGAG 


2297 


RSDNLAR 2 797 


QRAHLER 32 97 


QSGNLAR 3 7 97 


0.065


1238 


CTGGATGTT 


2298 


QSSALAR 2 7 98 


TSGNLVR 3 2 98 


RSDALRE 3 7 9 8 


0.313


1239 


CAGGACGTG 


2299 


RSDALTR 2 799 


DPGNLVR 3 2 99 


RSDNLKD 3 7 99 


0.144




GGGGAGGCA 




yoboLIK zoUU 




rCoJ-JrlJ_ioxv o o U U 


0.056


1241 


GAGGTGTCA 


2301 


QSHDLTK 2 8 01 


RSDALAR 3 3 01 


RSDNLAR 3 8 01 


0.027


1242 


GGGGTTGAA 


2302 


QSANLAR 2 8 02 


TSGSLTR 3 3 02 


RSDHLSR 3 8 02 


0.02


1243 


GGGGTTGAA 


2303 


QSANLAR 2 8 03 


QSSALTR 3 3 03 


RSDHLSR 3 8 03 


0.101


1244 


GTCGCGGTG 


2304 


RSDALTR 2 8 04 


RSDELQR 3 3 04 


DRSALAR 3 8 04 


0.044
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1245 


GTCGCGGTG 


2305 


RSDALTR 2805


RSDELQR 3305


DSGSLTR 3805


0.102


1246 


GTGGTTGCG 


2306 


RSDELTR 2806


TSGSLTR 3306


RSDALTR 3806


0.051


1247 


GTGGTTGCG 


2307 


RSDELTR 2807


TSGALTR 3307


RSDALTR 3807


0.117


1248 


GTCTAGGTA 


2308 


QSGALTR 2808


RSDNLTT 3308


DRSALAR 3808


5.14


1249 


CCGGGAGCG 


2309 


RSDELTR 2809


QSGHLTR 3309


RSDTLRE 3809


0.26


1250 


GAAGGAGAG 


2310 


RSDNLAR 2810


QSGHLTR 3310


QSGNLAR 3810


0.31


1252 


CCGGCTGGA 


2311 


QRAHLER 2811


QSSDLTR 3311


RSDTLRE 3811


0.153


1253 


CCGGGAGCG 


2312 


RSDELTR 2812


QRAHLER 3312


RSDTLRE 3812


0.228


1255 


ACGTAGTAG 


2313 


RSDNLTT 2813


RSDNLTK 3313


RSDTLKQ 3813


0.69


1256 


GGGGAGGAT 


2314 


QSSNLAR 2814


RSDNLQR 3314


RSDHLSR 3814




2 


1257 


GGGGAGGAT 


2315 


TTSNLAR 2815


RSDNLQR 3315


RSDHLSR 3815




1 


1258 


GGGGAGGAT 


2316 


QSSNLRR 2816


RSDNLQR 3316


RSDHLSR 3816




2 


1259 


GAGTGTGTG 


2317 


RSDSLLR 2817


DRDHLTR 3317


RSDNLAR 3817


1.5


1260 


GAGTGTGTG 


2318 


RLDSLLR 2818


DRDHLTR 3318


RSDNLAR 3818


1.8


1261 


TGCGGGGCA 


2319 


QSGDLTR 2819


RSDHLTR 3319


RRDTLHR 3819


0.2


1262 


TGCGGGGCA 


2320 


QSGDLTR 2820


RSDHLTR 3320


RLDTLGR 3820




3 


1263 


TGCGGGGCA 


2321 


QSGDLTR 2821


RSDHLTR 3321


DSGHLAS 3821


21 


1264 


AAGTTGGTT 


2322 


TTSALTR 2822


RADALMV 3322


RSDNLTQ 3822


0.21


1265 


AAGTTGGTT 


2323 


TTSALTR 2823


RSDALTT 3323


RSDNLTQ 3823


0.077


1266 


CAGGGTGGC 


2324 


DRSHLTR 2824


QSSHLAR 3324


RSDNLRE 3824


6.1


1267 


TAGGCAGTC 


2325 


DRSALTR 2825


QSGSLTR 3325


RSDNLTT 3825




6 


1268 


CTGTTGGCT 


2326 


QSSDLTR 2826


RADALMV 3326


RSDALRE 3826


1.52


1269 


CTGTTGGCT 


2327 


QSSDLTR 2827


RSDALTT 3327


RSDALRE 3827


12.3


1270 


TTGGATGGA 


2328 


QSGHLAR 2828


TSGNLVR 3328


RSDALTK 3828


0.4


1271 


GTGGCACTG 


2329 


RSDALRE 2829


QSGSLTR 3329


RSDALTR 3829


0.915


1272 


CAGGAGTCC 


2330 


DRSSLTT 2830


RSDNLAR 3330


RSDNLRE 3830


0.04


1273 


CAGGAGTCC 


2331 


ERGDLTT 2831


RSDNLAR 3331


RSDNLRE 3831


0.1


1274


GCATGGGAA 


2332 


QSANLSR 2832


RSDHLTT 3332


QSGSLTR 3832


0.306




uLn J. OVjTO^-i_rt. 


2333


QRSNLVR 2833


RSDHLTT 3333


QSGSLTR 3833


0.326


1276 


TAGGAAGAG 


2334 


RSDNLAR 2834


QRSNLVR 3334


RSDNLTT 3834


0.685


1277 


GAAGAGGGG 


2335 


RSDHLAR 2835


RSDNLAR 3335


QSGNLTR 3835


0.421


1278 


GAGTAGGCA 


2336 


QSGSLTR 2836


RSDNLRT 3336


RSDNLAR 3836


0.019


1279 


GAGGTGTCA 


2337 


QSGDLRT 2837


RSDALAR 3337


RSDNLAR 3837


0.025






1282 


TCGGTCGCC 


2338 


ERGDLTR 2838


DPGALVR 3338


RSDELRT 3838


74.1


1287 


GTGGTAGGA 


2339 


QSGHLAR 2839


QSGALAR 3339


RSDALTR 3839


0.152


1288 


CAGGGTGGC 


2340 


DRSHLTR 2840


QSSHLAR 3340


RSDNLTE 3840


4.1


1289 


TAGGCAGTC 


2341 


DRSALTR 2841


QSGSLTR 3341


RSDNLTK 3841


1.37


1290 


GTGGTGATA 


2342 


QSGALTQ 2842


RSHALTR 3342


RSDALTR 3842


24.05


1291 


GTGGTGATA 


2343 


QQASLNA 2843


RSHALTR 3343


RSDALTR 3843


20.55


1292 


TTGGATGGA 


2344 


QSGHLAR 2844


TSGNLVR 3344


RSDALTT 3844


4.12


1293 


AAGGTAGGT 


2345 


TSGHLVR 2845


QSGALAR 3345


RSDNLTQ 3845


0.457


1294 


AAGGTAGGT 


2346 


MSHHLSR 2846


QSGALAR 3346


RSDNLTQ 3846


2.75


1295 


CAGGAGTCC 


2347 


DRSSLTT 2847


RSDNLAR 3347


RSDNLTE 3847


0.116


1296 


CAGGAGTCC 


2348 


ERGDLTT 2848


RSDNLAR 3348


RSDNLTE 3848


37


1297 


TAGGAAGAG 


2349 


RSDNLAR 2849


QRSNLVR 3349


RSDNLTK 3849


0.05


1298 


CAGGACGTG 


2350 


RSDLATR 2850


DPGNLVR 3350


RSDNLTE 3850


0.05


1300 


GTCTAGGTA 


2351 


QSGALTR 2851


RSDNLTK 3351


DRSALAR 3851


0.46


1302 


CCGGCTGGA 


2352 


QSGHLTR 2852


QSSDLTR 3352


RSDTLRE 3852


0.05


1303 


TAGGAGTTT 


2353 


QRSALAS 2853


RSDNLAR 3353


RSDNLTT 3853


0.088


1306 


CTGGCCTTG 


2354 


RSDALTT 2854


DCRDLAR 3354


RSDALRE 3854


2.285


1308 


TGGGCAGCC 


2355 


ERGTLAR 2855


QSGSLTR 3355


RSDHLTT 3855


0.305


1309 


TAGGAGTTT 


2356 


QSSALAS 2856


RSDNLAR 3356


RSDNLTT 3856


0.184


1310 


TAGGAGTTT 


2357 


TTSALAS 2857


RSDNLAR 3357


RSDNLTT 3857


0.075


1311 


TGGGCAGCC 


2358 


ERGDLAR 2858


QSGSLTR 3358


RSDHLTT 3858


0.91


1312 


GGGGCGTGA 


2359 


QSGHLTK 2859


RSDELQR 3359


RSDHLSR 3859


0.23


1313 


GGGGCGTGA 


2360 


QSGHLTT 2860


RSDELQR 3360


RSDHLSR 3860


0.09


1314 


GTACAGTAG 


2361 


RSDNLTT 2861


RSDNLRE 3361


QSSSLVR 3861


3.09


1315 


GTACAGTAG 


2362 


RSDNLTT 2862


RSDNLTE 3362


QSSSLVR 3862


9.27


1318 


ATGGTGTGT 


2363 


TSSHLAS 2863


RSDALAR 3363


RSDALAQ 3863


0.048


1319 


ATGGTGTGT 


2364 


MSHHLTT 2864


RSDALAR 3364


RSDALAQ 3864


0.228


1320 


TTGGGAGAG 


2365 


RSDNLAR 2865


QRAHLER 3365


RSDALTT 3865


0.044




TTGGGAGAG 


2366 


RSDNLAR 2866


QRAHLER 3366


RADALMV 3866


0.127


1322 


GTGGGAATA 


2367 


QSGALTQ 2867


QSGHLTR 3367


RSDALTR 3867


0.799


1323 


GTGGGAATA 


2368 


QLTGLNQ 2868


QSGHLTR 3368


RSDALTR 3868


0.744


1324 


GTGGGAATA 


2369 


QQASLNA 2869


QSHHLTR 3369


RSDALTR 3869


1£ 


! . 52 


1325 


TTGGTTGGT 


2370 


TSGHLVR 2870


TSGSLTR 3370


RSDALTK 3870


0.306






1326 


TTGGTTGGT 


2371 


TSGHLVR 2871


QSSALTR 3371


RSDALTK 3871


4.385


1327 


TTGGTTGGT 


2372 


TSGHLVR 2872


TSGSLTR 3372


RSDALTT 3872


0.566


1328 


TTGGTTGGT 


2373 


TSGHLVR 2873


QSSALTR 3373


RSDALTT 3873


7.95


1329 


CTGGCCTGG 


2374 


RSDHLTT 2874


DRSDLTR 3374


RSDALRE 3874


0.68


1330 


GAGGTGTGA 


2375 


QSGHLTT 2875


RSDALTR 3375


RSDNLAR 3875


0.175


1331 


CTGGCCTGG 


2376 


RSDHLTT 2876


DCRDLAR 3376


RSDALRE 3876


0.388


1334 


CCGGCGCTG 


2377 


RSDALRE 2877


RSSDLTR 3377


RSDDLRE 3877


0.31


1335 


GACGCTGGC 


2378 


DRSHLTR 2878


QSSDLTR 3378


DSSNLTR 3878


1.4


1336 


CGGGCTGGA 


2379 


QSGHLAR 2879


QSSDLTR 3379


RSDHLAE 3879


1.4


1337 


CGGGCTGGA 


2380 


QSSHLAR 2880


QSSDLTR 3380


RSDHLAE 3880


0.235


1338 


GGGATGGCG 


2381 


RSDELTR 2881


RSDALTQ 3381


RSDHLSR 3881


1.04


1339 


GGGATGGCG 


2382 


RSDELTR 2882


RSDSLTQ 3382


RSDHLSR 3882


0.569


1340 


GGGATGGCG 


2383 


RSDELTR 2883


RSDALTQ 3383


RSHHLSR 3883


0.751


1341 


GGGATGGCG 


2384 


RSDELTR 2884


RSDSLTQ 3384


RSHHLSR 3884


4.1


1342 


CAGGCGCAG 


2385 


RSDNLRE 2885


RSSDLTR 3385


RSDNLTE 3885


0.68


1343 


CAGGCGCAG 


2386 


RSDNLTT 2886


RTSTLTR 3386


RSDNLTE 3886


37.04


1344 


CCGGGCGAC 


2387 


DRSNLTR 2 £ 


387 


DRSHLAR 3 3 87 


RSDTLRE 3£ 


387 


2.28 


1346 


GATGTGTGA 


2388 


QSGHLTT 2£ 


388 


RSDALAR 3 3 88 


TSANLSR 3 £ 


388 


0.153 


1347 


CAGTGAATG 


2389 


RSDALTS 2 £ 


389 


QSHHLTT 3 3 89 


RSDNLTE 3 £ 


389 


8 .23 


1348 


GGGTCACTG 


2390 


RSDALTA 2 £ 


390 


QAATLTT 3 3 9 0 


RSDHLSR 3£ 


390 


2 . 58 


1350 


CAGTGAATG 


2391 


RSDALTQ 2 £ 


391 


QSGHLTT 3 3 91 


RSDNLTE 3£ 


391 


74 . 1 


1351 


GGGTCACTG 


2392 


RSDALRE 2 £ 


392 


QSHDLTK 3 3 92 


RSDHLSR 3£ 


392 


0 .234 


1352 


GTGTGGGTC 


2393 


DRSALAR 2 £ 


393 


RSDHLTT 33 93 


RSDALTR 3 £ 


393 


0 . 023 


1353 


CTGGCGAGA 


2394 


QSGHLNQ 2 £ 


394 


RSDELQR 33 94 


RSDALRE 3 i 


394 


56.53 


1354 


CTGGCGAGA 


2395 


KNWKLQA 2 £ 


395 


RSDELQR 3 3 95 


RSDALRE 3! 


395 


20 . 85 


1355 


GCTTTGGCA 


2396 


QSGSLTR 2 £ 


396 


RSDALTT 3 3 96 


QSSDLTR 3! 


396 


0 . 172 


1356 


GCTTTGGCA 


2397 


QSGSLTR 2 1 


397 


RADALMV 3 3 97 


QSSDLTR 3! 


397 


0 . 034 


1357 


GACTTGGTA 


2398 


QSSSLVR 21 


398 


RSDALTT 3 3 98 


DRSNLTR 3 i 


398 


0 . 032 








QSSSLVR 2 J 


399 


RADALMV 3 3 9 9 


DRSNLTR 3 i 


399 


0 . 05 


1360 


CAGTTGTGA 


2400 


QSGHLTT 2 90 0 


RADALMV 34 00 


RSDNLTE 3 ! 


900 


41 . 7 


1361 


AAGGAAAAA 


2401 


QKTNLDT 2 901 


QSGNLQR 34 01 


RSDNLTQ 3 9 01 


0 . 835 


1362 


AAGGAAAAA 


2402 


QSGNLNQ 2 902 


QSGNLQR 34 02 


RSDNLTQ 3 : 


902 


0 . 332 


1363 


AAGGAAAAA 


2403 


QKTNLDT 2 ! 


903 


QRSNLVR 34 03 


RSDNLTQ 3 


903 


74 . 1 






1364  ATGGGTGAA  2404  QSANLSR  2904  QSSHLAR  3404  RSDALAQ  3904  [illegible]
1365  ATGGGTGAA  2405  QRSNLVR  2905  QSSHLAR  3405  RSDALAQ  3905  0.152
1366  ATGGGTGAA  2406  QSANLSR  2906  TSGHLVR  3406  RSDALAQ  3906  22.63
1367  ATGGGTGAA  2407  QRSNLVR  2907  TSGHLVR  3407  RSDALAQ  3907  1.028
1368  CTGGGAGAT  2408  QSSNLAR  2908  QRAHLER  3408  RSDALRE  3908  0.051
1369  CTGGGAGAT  2409  QSSNLAR  2909  QSGHLTR  3409  RSDALRE  3909  0.227
1373  GTGGTGGGC  2410  DRSHLTR  2910  RSDALSR  3410  RSDALTR  3910  0.025
1374  CCGGCGGTG  2411  RSDALTR  2911  RSDELQR  3411  RSDELRE  3911  0.003
1375  CCGGCGGTG  2412  RSDALTR  2912  RSDDLQR  3412  RSDELRE  3912  0.008
1376  CCGGCGGTG  2413  RSDALTR  2913  RSDERKR  3413  RSDELRE  3913  0.858
1377  CCGGCGGTG  2414  RSDALTR  2914  RSDELQR  3414  RSDDLRE  3914  0.012
1378  CCGGCGGTG  2415  RSDALTR  2915  RSDDLQR  3415  RSDDLRE  3915  0.012
1379  CCGGCGGTG  2416  RSDALTR  2916  RSDERKR  3416  RSDDLRE  3916  0.25
1380  GCCGACGGT  2417  QSSHLTR  2917  DRSNLTR  3417  ERGDLTR  3917  0.076
1381  GCCGACGGT  2418  QSSHLTR  2918  DPGNLVR  3418  ERGDLTR  3918  0.23
1382  GCCGACGGT  2419  QSSHLTR  2919  DRSNLTR  3419  DCRDLAR  3919  3.1
1383  GCCGACGGT  2420  QSSHLTR  2920  DPGNLVR  3420  DCRDLAR  3920  1.74
1384  GGTGTGGGC  2421  DRSHLTR  2921  RSDALSR  3421  MSHHLSR  3921  0.013
1385  TGGGCAAGA  2422  QSGHLNQ  2922  QSGSLTR  3422  RSDHLTT  3922  0.229
1386  TGGGCAAGA  2423  ENWKLQA  2923  QSGSLTR  3423  RSDHLTT  3923  0.193
1389  CTGGCCTGG  2424  RSDHLTT  2924  DCRDLAR  3424  RSDALRE  3924  0.175
1393  TGGGAAGCT  2425  QSSDLRR  2925  QSGNLAR  3425  RSDHLTT  3925  0.1
1394  TGGGAAGCT  2426  QSSDLRR  2926  QSGNLAR  3426  RSDHLTK  3926  0.04
1395  GAAGAGGGA  2427  QSGHLQR  2927  RSDNLAR  3427  QSGNLAR  3927  0.025
1396  GAAGAGGGA  2428  QRAHLAR  2928  RSDNLAR  3428  QSGNLAR  3928  0.107
1397  GAAGAGGGA  2429  QSSHLAR  2929  RSDNLAR  3429  QSGNLAR  3929  0.14
1398  TAATGGGGG  2430  RSDHLSR  2930  RSDHLTT  3430  QSGNLRT  3930  0.065
1399  TGGGAGTGT  2431  TKQHLKT  2931  RSDNLAR  3431  RSDHLTT  3931  0.1
1400  CCGGGTGAG  2432  RSDNLAR  2932  QSSHLAR  3432  RSDDLRE  3932  0.371
1401  GAGTTGGCC  2433  ERGTLAR  2933  RADALMV  3433  RSDNLAR  3933  0.167
1402  CTGGAGTTG  2434  RGDALTS  2934  RSDNLAR  3434  RSDALRE  3934  0.15
1403  ATGGCAATG  2435  RSDALTQ  2935  QSGSLTR  3435  RSDALTQ  3935  0.07
1404  GAGGCAGGG  2436  RSDHLSR  2936  QSGSLTR  3436  RSDNLAR  3936  0.022






1405  GAGGCAGGG  2437  RSDHLSR  2937  QSGDLTR  3437  RSDNLAR  3937  0.045
1406  GAAGCGGAG  2438  RSDNLAR  2938  RSDELTR  3438  QSGNLAR  3938  0.025
1407  GCGGGCGCA  2439  QSGSLTR  2939  DRSHLAR  3439  RSDERKR  3939  0.585
1408  CCGGCAGGG  2440  RSDHLSR  2940  QSGSLTR  3440  RSDELRE  3940  0.305
1409  CCGGCAGGG  2441  RSDHLSR  2941  QSGSLTR  3441  RSDDLRE  3941  0.153
1410  CCGGCGGCG  2442  RSDELTR  2942  RSDELQR  3442  RSDELRE  3942  0.814
1411  TGAGGCGAG  2443  RSDNLAR  2943  DRSHLAR  3443  QSGHLTK  3943  0.282
1412  CTGGCCGTG  2444  RSDSLLR  2944  ERGTLAR  3444  RSDALRE  3944  0.172
1413  CTGGCCGCG  2445  RSDELTR  2945  DRSDLTR  3445  RSDALRE  3945  0.152
1414  CTGGCCGCG  2446  RSDELTR  2946  ERGTLAR  3446  RSDALRE  3946  0.914
1415  GCGGCCGAG  2447  RSDNLAR  2947  DRSDLTR  3447  RSDELQR  3947  0.102
1416  GCGGCCGAG  2448  RSDNLAR  2948  ERGTLAR  3448  RSDELQR  3948  0.153
1417  GAGTTGGCC  2449  ERGTLAR  2949  RGDALTS  3449  RSDNLAR  3949  1.397
1418  CTGGAGTTG  2450  RADALMV  2950  RSDNLAR  3450  RSDALRE  3950  0.241
1422  GGGTCGGCG  2451  RSDELTR  2951  RSDDLTT  3451  RSDHLSR  3951  0.064
1423  GGGTCGGCG  2452  RSDELTR  2952  RSDDLTK  3452  RSDHLSR  3952  0.034
1424  CAGGGCCCG  2453  RSDELRE  2953  DRSHLAR  3453  RSDNLRE  3953  1.37
1427  CAGGGCCCG  2454  RSDDLRE  2954  DRSHLAR  3454  RSDNLTE  3954  0.271
1428  TGAGGCGAG  2455  RSDNLAR  2955  DRSHLAR  3455  QSVHLQS  3955  0.102
1429  TGAGGCGAG  2456  RSDNLAR  2956  DRSHLAR  3456  QSGHLTT  3956  0.074
1430  TCGGCCGCC  2457  ERGTLAR  2957  DRSDLTR  3457  RSDDLTK  3957  0.352
1431  TCGGCCGCC  2458  ERGTLAR  2958  DRSDLTR  3458  RSDDLAS  3958  6.17
1432  TCGGCCGCC  2459  ERGTLAR  2959  ERGTLAR  3459  RSDDLTK  3959  1.778
1434  CTGGCCGTG  2460  RSDSLLR  2960  DRSDLTR  3460  RSDALRE  3960  0.051
1435  TAATGGGGG  2461  RSDHLSR  2961  RSDHLTT  3461  QSGNLTK  3961  0.057
1436  TGGGAGTGT  2462  TSDHLAS  2962  RSDNLAR  3462  RSDHLTT  3962  0.026
1439  GGAGTGTTA  2463  QRSALAS  2963  RSDALAR  3463  QSGHLQR  3963  0.075
1440  GGAGTGTTA  2464  QSGALTK  2964  RSDALAR  3464  QSGHLQR  3964  0.035
      ATAGCTGGG        RSDHLSR  2965  QSSDLTR  3465  QSGALTQ  3965  0.262
1442  TGCTGGGCC  2466  ERGTLAR  2966  RSDHLTT  3466  DRSHLTK  3966  0.36
1443  TGGAAGGAA  2467  QSGNLAR  2967  RSDNLTQ  3467  RSHHLTT  3967  0.22
1444  TGGAAGGAA  2468  QSGNLAR  2968  RSDNLTQ  3468  RSSHLTT  3968  0.09
1445  TGGAAGGAA  2469  QSGNLAR  2969  RLDNLTA  3469  RSHHLTT  3969  0.182






1446  TGGAAGGAA  2470  QSGNLAR  2970  RLDNLTA  3470  RSSHLTT  3970  0.42
1454  GGAGAGGCT  2471  QSSDLRR  2971  RSDNLAR  3471  QSGHLQR  3971  0.01
1455  CGGGATGAA  2472  QSANLSR  2972  TSGNLVR  3472  RSDHLRE  3972  0.043
1456  GGAGAGGCT  2473  QSSDLRR  2973  RSDNLAR  3473  QRAHLAR  3973  0.016
1457  GCAGAGGAA  2474  QSANLSR  2974  RSDNLAR  3474  QSGSLTR  3974  0.014
1460  TTGGGGGAG  2475  RSDNLAR  2975  RSDHLTR  3475  RADALMV  3975  0.007
1461  GACGAGGAG  2476  RSANLAR  2976  RSDNLTR  3476  DRSNLTR  3976  0.014
1462  CGGGATGAA  2477  QSGNLAR  2977  TSGNLVR  3477  RSDHLRE  3977  0.05
1463  GAGGCTGTT  2478  TTSALTR  2978  QSSDLTR  3478  RSDNLAR  3978  0.003
1464  GACGAGGAG  2479  RSDNLAR  2979  RSDNLTR  3479  DRSNLTR  3979  0.002
1465  CTGGGAGTT  2480  TTSALTR  2980  QSGHLQR  3480  RSDALRE  3980  0.018
1466  CTGGGAGTT  2481  NRATLAR  2981  QSGHLQR  3481  RSDALRE  3981  0.017
1468  GGTGATGTC  2482  DRSALTR  2982  TSGNLVR  3482  MSHHLSR  3982  0.08
1469  GGTGATGTC  2483  DRSALTR  2983  TSGNLVR  3483  TSGHLVR  3983  0.28
1470  GGTGATGTC  2484  DRSALTR  2984  TSGNLVR  3484  QRAHLER  3984  0.156
1471  CTGGTTGGG  2485  RSDHLSR  2985  QSSALTR  3485  RSDALRE  3985  0.09
1472  TTGAAGGTT  2486  TTSALTR  2986  RSDNLTQ  3486  RADALMV  3986  3.22
1473  TTGAAGGTT  2487  TTSALTR  2987  RSDNLTQ  3487  RSDSLTT  3987  0.47
1474  TTGAAGGTT  2488  QSSALAR  2988  RSDNLTQ  3488  RADALMV  3988  1.39
1475  TTGAAGGTT  2489  QSSALAR  2989  RSDNLTQ  3489  RLHSLTT  3989  0.39
1476  TTGAAGGTT  2490  QSSALAR  2990  RSDNLTQ  3490  RSDSLTT  3990  0.305
1477  GCAGCCCGG  2491  RSDHLRE  2991  DRSDLTR  3491  QSGSLTR  3991  2.31
1479  GAAAGTTCA  2492  QSHDLTK  2992  MSHHLTQ  3492  QSGNLAR  3992  37.04
1480  GAAAGTTCA  2493  NKTDLGK  2993  TSGHLVQ  3493  QSGNLAR  3993  62.5
1481  GAAAGTTCA  2494  NKTDLGK  2994  TSDHLAS  3494  RSDELRE  3994  37.04
1482  CCGTGTGAC  2495  DRSNLTR  2995  TSDHLAS  3495  RSDELRE  3995  111.1
1483  CCGTGTGAC  2496  DRSNLTR  2996  MSHHLTT  3496  RSDELRE  3996  20.8
1484  GAAGTGGTA  2497  QSSSLVR  2997  RSDALSR  3497  QSGNLAR  3997  0.01
                       QSSDLRR  2998  QSGHLTT  3498  RSDNLTQ  3998  1.537
1486  GGGTTTGAC  2499  DRSNLTR  2999  TTSALAS  3499  RSDHLSR  3999  0.085
1487  TTGAAGGTT  2500  TTSALTR  3000  RSDNLTQ  3500  RLHSLTT  4000  0.188
1488  AAGTGGTAG  2501  QSSDLRR  3001  QSGHLTT  3501  RLDNRTQ  4001  5.64
1490  CTGGTTGGG  2502  RSDHLSR  3002  TSGSLTR  3502  RSDALRE  4002  0.04
1491  AAGGGTTCA  2503  NKTDLGK  3003  DSSKLSR  3503  RLDNRTA  4003  4.12
1492  AAGTGGTAG  2504  RSDNLTT  3004  RSDHLTT  3504  RSDNLTQ  4004  1.37
1493  AAGTGGTAG  2505  RSDNLTT  3005  RSDHLTT  3505  RLDNRTQ  4005  15.09
1494  GGGTTTGAC  2506  DRSNLTR  3006  QRSALAS  3506  RSDHLSR  4006  0.255
1496  TTGGGGGAG  2507  RSDNLAR  3007  RSDHLTR  3507  RSDALTT  4007  0.065
1497  GAGGCTCTT  2508  QSSALAR  3008  QSSDLTR  3508  RSDNLAR  4008  0.007
1498  GAGGTTGAT  2509  QSSNLAR  3009  QSSALTR  3509  RSDNLAR  4009  0.101
1499  GAGGTTGAT  2510  QSSNLAR  3010  TSGALTR  3510  RSDNLAR  4010  0.02
1500  GCAGAGGAA  2511  QSGNLAR  3011  RSDNLAR  3511  QSGSLTR  4011  0.003
1522  GCAATGGGT  2512  TSGHLVR  3012  RSDALTQ  3512  QSGDLTR  4012  0.08
TABLE 6

                       FINGER (N -> C)
TRIPLET (5'->3')   F1         F2         F3
AGG                                      RXDHXXQ
ATG                                      RXDAXXQ
CGG                                      RXDHXXE
                                         [illegible]
GAA
GAC                                      DXSNXXR
GAG                RXDNXXR    RXSNXXR    RXDNXXR
                              RXDNXXR
GAT                QXSNXXR    TXGNXXR
                   TXSNXXR
                   TXGNXXR
GCA                QXGSXXR    QXGDXXR
GCC                EXGTXXR
GCG                RXDEXXR    RXDEXXR    RXDEXXR
                                         RXDTXXK
GCT                QXSDXXR    TXGEXXR
                              QXSDXXR
GGA                           QXGHXXR    QXAHXXR
GGC                DXSHXXR    DXSHXXR
GGG                RXDHXXR    RXDHXXR    RXDHXXR
                                         RXDHXXK
GGT                                      TXGHXXR
GTA                           QXGSXXR
                              QXATXXR
GTG                RXDAXXR    RXDAXXR    RXDAXXR
                   RXDSXXR
TAG                           RXDNXXT
TCG                RXDDXXK
TGT                           TXDHXXS
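As an illustration of how the Table 6 consensus sequences can be read, the following Python sketch checks whether a seven-residue segment (positions -1 through +6) fits a consensus in which X stands for any amino acid. It is a minimal sketch, not part of the disclosure: the names TABLE6_EXCERPT, matches_consensus, split_target and check_finger are hypothetical, only three triplets of Table 6 are copied in, and the assignment of F3, F2 and F1 to the 5', middle and 3' triplets of a nine-base target is an assumption drawn from the rows of the preceding tables.

```python
# Excerpt of Table 6 (triplet -> finger position -> consensus sequences).
# Only three triplets are copied here; X means "any amino acid".
TABLE6_EXCERPT = {
    "GGG": {"F1": ["RXDHXXR"], "F2": ["RXDHXXR"], "F3": ["RXDHXXR", "RXDHXXK"]},
    "GCG": {"F1": ["RXDEXXR"], "F2": ["RXDEXXR"], "F3": ["RXDEXXR", "RXDTXXK"]},
    "GTG": {"F1": ["RXDAXXR", "RXDSXXR"], "F2": ["RXDAXXR"], "F3": ["RXDAXXR"]},
}


def matches_consensus(segment: str, consensus: str) -> bool:
    """Return True if every non-X position of the consensus equals the segment residue."""
    if len(segment) != len(consensus):
        return False
    return all(c == "X" or c == s for s, c in zip(segment, consensus))


def split_target(site: str) -> dict:
    """Split a nine-base target (written 5'->3') into triplet subsites.

    Assumption: F3 contacts the 5' triplet and F1 the 3' triplet.
    """
    if len(site) != 9:
        raise ValueError("expected a nine-base target site")
    return {"F3": site[0:3], "F2": site[3:6], "F1": site[6:9]}


def check_finger(site: str, finger: str, segment: str) -> bool:
    """Check one finger of a three-finger protein against the Table 6 excerpt."""
    triplet = split_target(site)[finger]
    motifs = TABLE6_EXCERPT.get(triplet, {}).get(finger, [])
    return any(matches_consensus(segment, m) for m in motifs)


if __name__ == "__main__":
    # Target CCGGCGGTG with F1 = RSDALTR and F2 = RSDELQR, as in one table row.
    print(check_finger("CCGGCGGTG", "F1", "RSDALTR"))
    print(check_finger("CCGGCGGTG", "F2", "RSDELQR"))
```

A triplet that is absent from the excerpt (for example CCG) simply returns False here; extending the dictionary with further Table 6 entries would be needed for a fuller check.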





WHAT IS CLAIMED IS: 
1. A zinc finger which binds to a target subsite wherein amino acids -1
through +6 of the zinc finger and the nucleotide sequence of the target subsite are as
specified in Table 6.

2. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence DXSNXXR and the nucleotide sequence of the
target subsite is GAC.

3. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence RX(D/S)NXXR and the nucleotide sequence of
the target subsite is GAG.

4. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence TXGNXXR and the nucleotide sequence of the
target subsite is GAT.

5. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence (Q/T)XSNXXR and the nucleotide sequence of
the target subsite is GAT.

6. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence QXG(S/D)XXR and the nucleotide sequence of
the target subsite is GCA.

7. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence RXDEXXR and the nucleotide sequence of the
target subsite is GCG.

8. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence QXDSXXR and the nucleotide sequence of the
target subsite is GCT.
9. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence QX(G/A)HXXR and the nucleotide sequence of
the target subsite is GGA.

10. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence DXSHXXR and the nucleotide sequence of the
target subsite is GGC.

11. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence RXDHXXR and the nucleotide sequence of the
target subsite is GGG.

12. A zinc finger according to claim 1 wherein amino acids -1 through
+6 of the zinc finger have the sequence RXDAXXR and the nucleotide sequence of the
target subsite is GTG.

13. A nucleic acid encoding a polypeptide wherein the polypeptide
comprises a zinc finger according to claim 1.

14. A segment of a zinc finger comprising a sequence of seven
contiguous amino acids as shown in any of Tables 1-5.

15. A nucleic acid encoding a polypeptide wherein the polypeptide
comprises a segment of a zinc finger according to claim 14.

16. A zinc finger protein comprising first, second and third zinc
fingers, wherein the zinc fingers comprise respectively first, second and third segments of
seven contiguous amino acids as shown in a row of Tables 1-5.

17. A nucleic acid encoding a zinc finger protein according to claim 16.
ABSTRACT OF THE DISCLOSURE
The invention provides nonnaturally occurring zinc finger proteins, and
corresponding target sites bound by the proteins. Consensus amino acid sequences for
design of zinc fingers having a given target subsite binding specificity are also provided.